Abstract


Speeding up inferences made from large knowledge bases is key to scaling up
knowledge-based systems. To do so, a system must be able to automatically identify
and ignore information that is irrelevant to a specific task. Identifying irrelevant
knowledge is also key to enabling reasoning in environments in which several systems
(and their respective knowledge bases) interoperate. This dissertation considers the
problem of reasoning about irrelevance of knowledge in a principled and efficient
manner. Specifically, it is concerned with two key problems: (1) developing algorithms
for automatically deciding what parts of a knowledge base are irrelevant to a query
and (2) the utility of relevance reasoning.

As a basis for addressing these problems, we present a formal framework for
analyzing irrelevance. The framework includes a space of possible definitions of
irrelevance, based on a proof-theoretic analysis of the notion. Within the space of
definitions, we identify the class of strong irrelevance claims, which has two desirable
properties: strong irrelevance claims can be derived automatically and efficiently, and
they are guaranteed to lead to savings in inference.

The dissertation describes a novel tool, the query-tree, for reasoning about
irrelevance. Based on the query-tree, we develop several algorithms for deciding what
formulas are irrelevant to a query. These algorithms dramatically speed up inference,
especially when the knowledge base includes a large database of ground facts. The
query-tree has been investigated primarily for Horn rule knowledge bases with
interpretable constraints (e.g., order and sort constraints), and several more expressive
extensions. For certain cases, the algorithms are shown to be complete, in that they
detect all the irrelevant formulas. An important aspect of the query-tree is that it
can be built by examining only a small part of the knowledge base (e.g., only the
rules), and can therefore be built efficiently. The query-tree is also used to derive
the consequences of irrelevance knowledge given by a user. The dissertation presents
an empirical analysis of the algorithms when doing backward chaining on Horn rules,
showing that in practice, significant savings (often orders of magnitude) are obtained
by relevance reasoning.

Our general framework sheds new light on the problem of detecting independence
of queries from updates. We present new results that significantly extend previous
work in this area. The framework also provides a setting in which to investigate
the connection between the notion of irrelevance and the creation of abstractions.
We propose a new approach to research on reasoning with abstractions, in which we
investigate the properties of an abstraction by considering the irrelevance claims on
which it is based. We demonstrate the potential of the approach for the cases of
abstraction of predicates and projection of predicate arguments.

Finally, we describe an application of relevance reasoning to the domain of
modeling physical devices. We consider the task of selecting a model for a device and a
query by composing model fragments, each describing a single phenomenon in the
physical world, at different levels of abstraction and approximation. We present a novel
model-composition algorithm based on irrelevance that composes a model with
appropriate abstractions and perspectives for answering the query.
Chapter 1
Introduction 


The distinguishing characteristic of research in Artificial Intelligence (AI) is that it
attempts to automate cognitive tasks that are natural to humans and at which humans
are proficient. Prime examples of such research include computer vision, natural
language understanding, automatic planning and the formalization of common sense
reasoning. In performing cognitive tasks, humans have the natural ability to ignore
irrelevant information. We have constant access to very large amounts of information,
either in our memory or through external sources. However, when given a specific
task, we are usually able to focus on the knowledge that is relevant to that task, thus
enabling us to reason in a timely fashion.

In order for machines to be able to reason efficiently in the presence of large
amounts of information, they too must be able to ignore irrelevant information. In
fact, the inability of current AI systems to ignore irrelevant information is a major
obstacle in scaling up such systems. It is well known that the performance of inference
engines in AI systems that use declarative representations degrades quickly as the size
of the knowledge base increases. Two of the major sources of inefficiency of inference
engines are due to this problem:

• In its search for a solution, the inference engine considers many facts in the
knowledge base that are irrelevant to the query. Consequently, it spends
significant effort pursuing useless solution paths.

• A knowledge base is designed to accommodate a variety of tasks. Therefore, its
conceptualization of the domain must be detailed enough for all of them.
Consequently, given a specific task the knowledge base is likely to be too complex
for it, leading to inefficient reasoning. For example, it may make unnecessary
distinctions between objects in the domain or between properties of these
objects. In order to achieve efficient performance, an inference engine must be able
to abstract the representation automatically by removing irrelevant distinctions
in the representation.

Both of these issues will become even more important in the context of future
large scale AI systems (e.g., [Fikes et al., 1991; Genesereth, 1992]). Such systems will
have access to large amounts of knowledge coming from multiple autonomous sources.
The knowledge will overlap in many ways and will be represented in multiple levels
of abstraction. Reasoning mechanisms in such systems must be able to decide
automatically what knowledge is relevant to a specific task, and what level of abstraction
is most adequate.

To illustrate these issues, consider the following simple example knowledge base. 

flight(X, Y, S1, E1, C) ∧ (S < S1) ∧ (E > E1) ⇒ path(X, Y, S, E, C)
bus(X, Y, S1, E1, C) ∧ (S < S1) ∧ (E > E1) ⇒ path(X, Y, S, E, C)
path(X, Z, S, E1, C1) ∧ path(Z, Y, E1, E, C2) ∧ (C1 + C2 < C) ⇒ path(X, Y, S, E, C)
flight(X, Y, S, E, C) ⇒ (C > 70)

The atom flight(X, Y, S, E, C) (bus(X, Y, S, E, C)) denotes that there is a direct
flight (bus) from city X to city Y, departing at time S and arriving at E. The cost
of the flight is C dollars. The atom path(X, Y, S, E, C) denotes that there is a path
(i.e., sequence of flights and busses) from X to Y, leaving and arriving between S
and E and costing at most C dollars. Finally, all flights are known to cost more than
$70. The knowledge base also contains ground atomic facts for the relations flight
and bus.

Suppose we are given the query path(SF, LA, 8am, 4pm, $50). With respect to
this query, the first rule in the knowledge base and all the flight ground facts are
irrelevant. Ground facts of bus that cost more than $70 or do not run between
8am and 4pm are also irrelevant and can therefore be ignored. Doing so will result
in significant savings in answering the query. In contrast, a conventional backward
chainer reasoning with this knowledge base will encounter irrelevant facts at various
points in its search. In the best scenario, it will immediately realize that a fact is
irrelevant (by propagating the constraints) and backtrack. Otherwise, it will continue
its search, producing a search subspace based on the use of an irrelevant fact, and
realize only later that the subspace could be eliminated. Even if the backward chainer
does realize immediately that a fact that it encounters is irrelevant, there may be
many such irrelevant facts, and considering each of these will be very expensive.
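
The pruning described for this query can be sketched in a few lines of Python. This is a minimal illustration only, assuming ground facts are stored as tuples with times given as hours on a 24-hour clock; the names Fact and relevant_facts and the sample data are hypothetical and not part of the system described here. The filter applies exactly the conditions stated in the text.

```python
from typing import NamedTuple

class Fact(NamedTuple):
    kind: str     # "flight" or "bus"
    origin: str
    dest: str
    depart: int   # hour of day, 24-hour clock (assumed encoding)
    arrive: int
    cost: int     # dollars

def relevant_facts(facts, start=8, end=16, bus_cost_bound=70):
    """Drop the facts the text calls irrelevant to path(SF, LA, 8am, 4pm, $50)."""
    kept = []
    for f in facts:
        if f.kind == "flight":
            continue  # every flight costs more than $70, so none can be used
        if f.cost > bus_cost_bound:
            continue  # bus facts costing more than $70 are irrelevant
        if f.depart < start or f.arrive > end:
            continue  # bus does not run between 8am and 4pm
        kept.append(f)
    return kept

# Hypothetical sample data.
facts = [
    Fact("flight", "SF", "LA", 9, 10, 120),
    Fact("bus", "SF", "SJ", 9, 10, 15),
    Fact("bus", "SJ", "LA", 11, 15, 30),
    Fact("bus", "SF", "LA", 6, 14, 40),   # departs before 8am
    Fact("bus", "SF", "LA", 9, 15, 95),   # costs more than $70
]
pruned = relevant_facts(facts)
```

A backward chainer run over pruned rather than facts never encounters the three discarded facts, which is the source of the savings discussed above.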

Alternatively, suppose we only want to know if there exists some path between two
cities using the connections in our knowledge base. In such a case, we can abstract
the representation of the domain and modify the rules appropriately. Specifically, the
predicates can be reduced from arity 5 to binary (e.g., flight(X, Y) denoting that
there is a flight between X and Y). Moreover, the distinction between flights and
busses is also irrelevant, and therefore we can abstract the distinction between flight
and bus (and other travel media we may have). In doing so, we would replace every
flight (and bus) ground fact by a ground fact of a new predicate directConnection.
Our knowledge base would now be

directConnection(X, Y) ⇒ path(X, Y)
path(X, Z) ∧ path(Z, Y) ⇒ path(X, Y)

This knowledge base will yield a much smaller search space and will still enable 
us to answer the query. 
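
As a rough illustration of this abstraction, the following Python sketch projects 5-ary flight and bus facts onto binary directConnection pairs and answers the existence query by reachability. The function names, the tuple encoding and the sample facts are assumptions made for the example; the reachability loop simply mirrors the two abstract rules and is not the inference procedure studied in this dissertation.

```python
def abstract(facts):
    """Project (pred, origin, dest, start, end, cost) facts onto directConnection pairs."""
    return {(origin, dest) for (_pred, origin, dest, _s, _e, _c) in facts}

def path_exists(connections, src, dst):
    """Is there a path of one or more direct connections from src to dst?"""
    seen, frontier = set(), [src]
    while frontier:
        city = frontier.pop()
        for a, b in connections:
            if a == city and b not in seen:
                if b == dst:
                    return True
                seen.add(b)
                frontier.append(b)
    return False

# Hypothetical sample data: times and costs are dropped by the abstraction.
facts = [
    ("flight", "SF", "DEN", 9, 12, 180),
    ("bus", "DEN", "LA", 13, 23, 60),
    ("bus", "LA", "SD", 8, 11, 20),
]
connections = abstract(facts)
```

Here path_exists(connections, "SF", "LA") holds, while path_exists(connections, "SD", "SF") does not, and neither check ever inspects departure times, arrival times or costs.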

Aside from its use in controlling inference, the need to identify irrelevant knowledge 
also arises in other contexts in AI: 

• Nonmonotonic reasoning: In nonmonotonic reasoning, a conclusion drawn
from a set of formulas is not guaranteed to hold when additional formulas are
considered. Consequently, the inferences made depend in subtle ways on which
formulas are considered. A key property that has been the focus of several
nonmonotonic formalisms (e.g., [Pearl, 1990; Geffner and Pearl, 1990]) is designing
reasoning schemes in which the addition of irrelevant formulas does not change
the conclusions. However, the notion of irrelevance has been treated informally
thus far in this work.

• Reasoning by analogy: Often, properties of one object can be used to
conclude properties of another, if there is some analogy between the two objects.
However, for the reasoning to be meaningful, the analogy between the objects
must be relevant to the property being concluded. Automating such reasoning
requires a good understanding of the notion of relevance.

• Learning: A drawback of many learning systems is that they produce overly
specific descriptions of the concepts being learned. This happens when the learned
descriptions contain irrelevant information. Using overly specific concept
descriptions often degrades the performance of systems (e.g., EBL). Removing
irrelevant information is key to making such concept descriptions useful in
problem solving [Etzioni and Minton, 1992].

This dissertation studies the issues involved in reasoning about irrelevance. It
presents a general framework and specific methods that enable a system to reason
about irrelevance of knowledge to a query. Relevance reasoning is done both by using
additional knowledge specified by the user and by automatic methods for
analyzing the knowledge base and a specific query. Additional knowledge is specified to
the system in the form of meta-level irrelevance claims in a language given in the
framework.
1.1 Components of the Problem

We break the problem down into the following components:

1. As a basis for stating knowledge about irrelevance and reasoning with it in a 
principled manner, we must: 

• Formally define the meaning (or meanings) of irrelevance. 

• Identify the different types of irrelevance with which we want to reason. 

• Devise a language for expressing knowledge about irrelevance.

2. In reasoning about irrelevance, we consider two questions: 

• Given a knowledge base and a query, can we decide automatically which
facts in the knowledge base are irrelevant to the query (and can we do so
efficiently)?

• How can we derive logical conclusions from meta-level irrelevance claims
that are given to the system?

3. Using irrelevance reasoning to control inference: 

• How can we modify inference mechanisms to exploit knowledge about ir- 
relevance? 

• What is the utility of relevance reasoning (in theory and in practice)? 

1.2 Overview of the Solutions 

We present an overview of the solutions we propose for the questions we address, as
well as a description of an application of our framework to the problem of selecting
models for physical systems.


1.2.1 Analyzing Irrelevance 

The notion of irrelevance has been used in many contexts in research in AI and
related fields. However, most of the time researchers use the term informally. Formal
analyses of irrelevance have been discussed by philosophers as early as [Keynes, 1921],
[Carnap, 1950] and [Gardenfors, 1978]. The main thrust of these analyses was to try
to capture our common sense notions of irrelevance by a formal definition. Most of
the work focuses on formulating properties of the notion of irrelevance and finding
definitions that satisfy the properties. Consequently, the work has not been concerned
with how to use irrelevance for speeding up inference or how to design algorithms for
detecting irrelevance.

Within AI, the notion of irrelevance was investigated in the context of
probabilistic reasoning [Pearl, 1988] and used there to control inference in Bayesian belief
networks. In the context of logical knowledge bases, Subramanian [Subramanian,
1989] investigated several formal definitions of irrelevance. However, the issues of
deriving irrelevance claims and the utility of irrelevance reasoning were left largely
open.

We want our definitions of irrelevance to make sufficient distinctions to make them
useful in developing algorithms for detecting irrelevance. To do so, we analyze
irrelevance at the level of the possible derivations (or more generally, solution paths) that
a problem solver can pursue in the solution of a goal. In contrast, other analyses have
been at the model theoretic level [Gardenfors, 1978] or the meta-theoretic level
[Subramanian, 1989]. Furthermore, we do not purport to provide a single best definition
of irrelevance. Instead, we provide a space of possible definitions of irrelevance and
analyze how the properties of irrelevance change as we move in the space.

We begin by considering the question of defining irrelevance of a single formula
to a query. Specifically, if A is a knowledge base, q is a query and f is some formula,
we will define when f is irrelevant to q with respect to A.

The first distinction made in our space is between weak irrelevance and strong
irrelevance. In the former, f will be irrelevant to q if there is some derivation of q
that does not use f. In strong irrelevance, f will be considered irrelevant if it is not
used in any derivation of q from A. Each of these classes can be further refined by
considering only a specific set of derivations in the definition. For example, we can
define f to be strongly irrelevant to q if it is not used in any minimal derivation of
q.¹ Furthermore, definitions vary in the way we define what it means for a formula
to be used in a derivation. For example, we can define f to be used in a derivation D
if it appears somewhere in D, or, alternatively, if it is implied by the formulas in D.
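
To make the weak/strong distinction concrete, here is a small Python sketch for a propositional Horn knowledge base. It is an illustration under stated assumptions rather than the formal definitions of the framework: a derivation is represented simply as the set of formula names appearing in a proof, a formula counts as used when it appears in that set, and proofs that revisit a goal on the same branch are skipped. The names derivations, weakly_irrelevant and strongly_irrelevant and the toy knowledge base are hypothetical.

```python
from itertools import product

def derivations(kb, goal, _path=()):
    """Return the derivations of goal, each as a frozenset of formula names.

    kb maps a formula name to (head, body); ground facts have an empty body.
    """
    if goal in _path:              # do not revisit a goal on the same branch
        return set()
    found = set()
    for name, (head, body) in kb.items():
        if head != goal:
            continue
        sub = [derivations(kb, atom, _path + (goal,)) for atom in body]
        for combo in product(*sub):   # one sub-derivation per body atom
            found.add(frozenset([name]).union(*combo))
    return found

def weakly_irrelevant(kb, query, f):
    """Some derivation of the query does not use f."""
    return any(f not in d for d in derivations(kb, query))

def strongly_irrelevant(kb, query, f):
    """No derivation of the query uses f."""
    return all(f not in d for d in derivations(kb, query))

# Toy knowledge base: q follows from a (via r1) or from b (via r2); c is unrelated.
kb = {
    "f1": ("a", ()),
    "f2": ("b", ()),
    "f3": ("c", ()),
    "r1": ("q", ("a",)),
    "r2": ("q", ("b",)),
}
```

In this toy case f1 is weakly but not strongly irrelevant to q, since the derivation through r2 and f2 avoids it while the one through r1 uses it, whereas f3 is strongly irrelevant. Enumerating derivations explicitly, as done here, is exactly what a practical algorithm must avoid.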

Besides irrelevance of formulas, the framework also considers irrelevance of other
subjects. For example, we define irrelevance of predicates, objects, refinements of
predicates and distinctions between objects. These kinds of irrelevance are later used
as justifications for creating abstractions.

The framework has enabled us to make several important distinctions. For ex- 
ample, the class of strong irrelevance claims is shown to have several properties not 
shared by weak irrelevance. In many cases, it is possible to find all strongly irrelevant 
formulas efficiently. Furthermore, removing strongly irrelevant formulas is shown to 
speed up inference significantly and is guaranteed never to slow it down. Finally, 
several instances of strong irrelevance satisfy properties that have been argued to be 
desirable of a common sense notion of irrelevance in the philosophical literature.

¹ Given some definition of minimality of derivations.
Our framework is shown to be general in that it encompasses definitions discussed
in the past. These include definitions given by Subramanian [Subramanian, 1989]
and definitions given in analysis of databases [Srivastava and Ramakrishnan, 1992]
and [Elkan, 1990]. The framework provides important insights into the problem of
detecting independence of queries from updates in databases, enabling us to develop
new algorithms for solving the independence problem.

1.2.2 Automatically Detecting Irrelevance 

A major focus of the thesis is the investigation of the problem of automatically
deciding which formulas are irrelevant to a given query. We first address the following
question:

• Given a knowledge base A and a query q, which formulas in A are irrelevant to
q?

We later use the techniques developed to answer this question in order to solve
the problem of deriving logical conclusions from irrelevance claims that are given
to the system by an external source. We consider the problem for knowledge bases
containing Horn rules, and several more expressive extensions.

In general, deciding which formulas are irrelevant to a given query can be more
expensive than solving the query itself (without relevance reasoning), especially in
large knowledge bases. Furthermore, if the knowledge base changes, the relevance
reasoning needs to be repeated. In order for our algorithms to be of practical interest,
we must derive irrelevance claims by examining only a small and stable part of the
KB, and derive claims that will hold independent of any changes that are made to
other unexamined parts. In many applications using Horn rule knowledge bases, the
bulk of the KB is ground facts, and the ground facts are much more
prone to frequent changes. Often, the ground facts will be stored in some database.
Therefore we address the following question. Suppose a knowledge base consists of
a set of rules V and a set of ground atomic facts D.

• Given a set of rules V and a query q, which rules in V and which sets of ground
facts are irrelevant to q for any choice of ground facts D?

We consider the problem for several cases of strong irrelevance. For weak
irrelevance, the problem is in general undecidable even for simple languages (e.g., no
function symbols or recursion). Algorithms that provide sufficient conditions for weak
irrelevance are discussed in Chapter 5. The following example illustrates the reasoning
we perform.
Example 1.1: Consider the following set of rules:

r1: badPoint(X) ∧ path(X, Y) ∧ goodPoint(Y) ⇒ goodPath(X, Y).
r2: link(X, Y) ⇒ path(X, Y).
r3: link(X, Z) ∧ path(Z, Y) ⇒ path(X, Y).
r4: step(X, Y) ⇒ link(X, Y).
r5: bigStep(X, Y) ⇒ link(X, Y).

The predicates step and bigStep describe single links between points in a space.
The predicate path denotes the paths that can be constructed by composing single
links. The predicate goodPath denotes paths that go from bad points to good ones.
A knowledge base contains these rules and various ground facts using the predicates
step, bigStep, goodPoint and badPoint. Furthermore, we are given that all the ground
facts that may appear in the knowledge base satisfy the following constraints:

badPoint(X) ⇒ 100 < X < 200.
step(X, Y) ⇒ X < Y.
goodPoint(X) ⇒ 150 < X < 170.
bigStep(X, Y) ⇒ X < 100 ∧ Y > 200.

Figure 1.1 is a symbolic representation of the possible derivations of facts of the
form goodPath(X, Y). By analyzing the structure of the rules and the constraints
appearing in them, it can be seen that rule r5 will not appear in any derivation of the
query, and is therefore strongly irrelevant. Similarly, ground facts in the knowledge
base of the form step(X, Y) that do not satisfy 100 < X and Y < 170 are also strongly
irrelevant to the query.
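
The two conclusions of Example 1.1 can be checked mechanically. The Python sketch below is only an illustration under assumptions: variables range over numbers, a constraint on a single variable is encoded as an open interval, and the bound 100 < X on the first argument of link nodes is taken from the condition stated above. The names intersect, LINK_X, BIG_STEP_X and step_fact_is_relevant, and the sample facts, are hypothetical.

```python
import math

def intersect(a, b):
    """Intersect two open intervals (lo, hi); return None when the result is empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else None

# Any link(X, _) used in a derivation of goodPath needs 100 < X (from the text).
LINK_X = (100, math.inf)
# bigStep(X, Y) => X < 100, so r5 can only produce links with X < 100.
BIG_STEP_X = (-math.inf, 100)

# The two requirements on X are jointly unsatisfiable, so r5 never contributes.
r5_can_fire = intersect(LINK_X, BIG_STEP_X) is not None

def step_fact_is_relevant(x, y):
    """A step(x, y) fact can appear in a derivation only if 100 < x and y < 170."""
    return 100 < x and y < 170

# Hypothetical ground facts for step.
step_facts = [(90, 120), (110, 140), (150, 165), (160, 180)]
kept = [f for f in step_facts if step_fact_is_relevant(*f)]
```

Only (110, 140) and (150, 165) survive the filter, and r5_can_fire evaluates to False, matching the claims in the example.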

The main difficulty in irrelevance reasoning is that we need to establish properties
of all the possible derivations of the query. However, we need to do so without
enumerating all the derivations. To this end, we have developed a novel tool, the query-tree.
The query-tree is a finite AND/OR tree that symbolically encodes all the possible
derivations that can be generated for the query from the given set of rules (the
query-tree of Example 1.1 is shown in Figure 1.1). In building the query-tree, we need to
address two issues. First, a simple-minded top-down construction of the tree will
not terminate if the rules in the knowledge base are recursive. Therefore, we need
some principled method to terminate the expansion of the tree. Second, we need to
carefully manipulate the interpretable constraints that appear in the rules in order
to be able to derive all the irrelevance claims.

The query-tree generation algorithm addresses these issues by attaching a set of
labels to nodes in the tree. For example, in Figure 1.1 the labels of the nodes describe
the constraints that need to hold on instances of that node in valid derivations. We
assign labels to nodes in the tree as we expand it, and we only expand a goal-node
in the tree if there is no other node that is expanded and has an isomorphic label.
The labeling scheme of nodes in the tree is chosen to satisfy two constraints. First,
the number of possible labels must be finite. This property guarantees that the
construction of the tree will terminate. Second, the labeling scheme is chosen such
that the resulting tree will encode precisely the set of desired derivations.

[Figure 1.1: An example query-tree]

If the query-tree encodes precisely the set of derivations of interest, it provides a
basis for a sound and complete inference procedure for strong irrelevance. Specifically,
a rule is strongly irrelevant to the query if and only if it does not appear anywhere
in the tree. For example, in Figure 1.1, rule r5 does not appear in the tree, and is
therefore strongly irrelevant to goodPath(X, Y). A ground fact is strongly irrelevant
if and only if it does not match any node in the tree. In Figure 1.1, ground facts
of step(X, Y) for which X < 100 will not match any node in the query-tree and are
therefore strongly irrelevant.
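
The construction and the rule test can be sketched in Python for a deliberately simplified setting. This is not the query-tree algorithm of this dissertation: the sketch assumes that a node's label is just its predicate name and ignores interpretable constraints altogether, so it encodes a superset of the valid derivations and does not prune r5. What it does show is the termination device described above (a goal-node is expanded only if no node with the same label has been expanded) and the test for rules that never appear in the tree. All class and function names, and rule r6, are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class GoalNode:
    pred: str                                          # label: predicate name only
    rule_children: list = field(default_factory=list)  # OR branches

@dataclass
class RuleNode:
    name: str
    goal_children: list = field(default_factory=list)  # AND branches

def build_query_tree(rules, query_pred):
    """rules maps a rule name to (head predicate, list of body predicates)."""
    expanded_labels = set()
    root = GoalNode(query_pred)
    stack = [root]
    while stack:
        goal = stack.pop()
        if goal.pred in expanded_labels:
            continue          # a node with the same label is already expanded
        expanded_labels.add(goal.pred)
        for name, (head, body) in rules.items():
            if head != goal.pred:
                continue
            rule_node = RuleNode(name)
            for pred in body:
                child = GoalNode(pred)
                rule_node.goal_children.append(child)
                stack.append(child)
            goal.rule_children.append(rule_node)
    return root

def rules_in_tree(root):
    """Collect the names of all rules that appear somewhere in the tree."""
    found, stack = set(), [root]
    while stack:
        goal = stack.pop()
        for rule_node in goal.rule_children:
            found.add(rule_node.name)
            stack.extend(rule_node.goal_children)
    return found

# Rules of Example 1.1 at the predicate level, plus a hypothetical unrelated rule r6.
rules = {
    "r1": ("goodPath", ["badPoint", "path", "goodPoint"]),
    "r2": ("path", ["link"]),
    "r3": ("path", ["link", "path"]),
    "r4": ("link", ["step"]),
    "r5": ("link", ["bigStep"]),
    "r6": ("detour", ["step"]),
}
tree = build_query_tree(rules, "goodPath")
irrelevant_rules = set(rules) - rules_in_tree(tree)
```

The construction terminates despite the recursive rule r3, and r6 is reported as irrelevant because it never appears in the tree. Pruning r5 as well requires labels that carry the order constraints, which this sketch omits.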

We show that we can devise labeling schemes for nodes in the tree that enable
us to encode precisely all derivations (and therefore all strong irrelevance claims)
for function-free Horn rules that allow a wide class of interpretable constraints (e.g.,
order constraints, sort constraints). We also discuss a labeling scheme that enables
us to encode precisely all the minimal derivations of the query. Finally, we discuss
a labeling scheme for encoding precisely all valid derivations of the query when rules
may have limited forms of negation in their antecedents.

In some cases (e.g., recursive rules with function symbols) it is not possible to
devise an appropriate labeling scheme, and therefore the query-tree encodes a
superset of the valid derivations. In these cases, the query-tree provides only a sound
inference procedure for strong irrelevance. This means that if a rule (or ground fact)
does not match any node in the query-tree then it is strongly irrelevant to the query.
However, if it does match, that does not necessarily imply that it is not strongly
irrelevant. Consequently, using the query-tree enables us to remove only a subset of
the strongly irrelevant formulas in the knowledge base.

An important aspect of the query-tree is that it can be built efficiently (and
therefore strong irrelevance can be derived efficiently). The size of the query-tree is
linear in the number of rules in the knowledge base.² Its size also depends on the number
of different labels we attach to nodes in the tree. This number may be exponential
in the arity of the predicates in the KB. However, arities of predicates tend to be
very small in practice (e.g., frame systems usually employ only binary predicates).
Furthermore, finding examples in which the exponential running time occurs requires
careful crafting of the rules. Moreover, since the query-tree is built based only on the
rules, it need not be recomputed when the ground facts change. Therefore the cost
of building the tree can be amortized over many queries.

The query-tree is related to several graph-like structures discussed in the
literature, such as connection graphs [Kowalski, 1975], problem space graphs [Etzioni,
1993], compilation graphs [Bruynooghe et al., 1989] and rule-goal graphs [Ullman,
1989]. The main property distinguishing the query-tree from other structures is the
principled treatment of recursion and interpretable constraints. As a result, it is the
only structure that computes the tightest constraints on the possible ground facts that
appear in derivations, and therefore only the query-tree provides a complete inference
procedure for strong irrelevance. In addition, the method of building the query-tree is
more general than the methods used for building other structures, and therefore we
are able to extend it to encode other sets of derivations (e.g., the set of minimal
derivations).

1.2.3 Using Irrelevance to Control Inference 

We investigate several methods of using irrelevance reasoning to speed up inference:

1. Remove irrelevant formulas: The first method is a simple usage of the
query-tree. Given a query (or class of queries) we build its query-tree and decide which
formulas are not strongly irrelevant to the query. We then ignore the irrelevant
formulas when solving queries of this class by building a specialized index only on
the relevant formulas.

2. Ignore irrelevant solution paths: Aside from encoding only the relevant rules
and ground facts, the query-tree also encodes all the sequences of rule applications
that can lead to answers to the query. We show how to modify a backward chainer
so that it follows only these sequences.

² More precisely, it is linear in the number of rules that are connected to the query through a
simple reachability analysis.

We describe the results of experiments designed to measure the impact of these
savings in practice when performing backward chaining on Horn rules. The
experiments show that significant savings are achieved by creating specialized indices, while
the cost of building the query-tree and of building the indices is insignificant.
Moreover, the results suggest that these methods will scale up to large knowledge bases
and will be especially effective there. For instance, in Example 1.1 when the
query-tree deemed 65% of the ground facts in the knowledge base irrelevant to the query,
inference was sped up by a factor of 15. When 80% of the ground facts were deemed
irrelevant the speedup grew to a factor of 90.

We also discuss how the query-tree can be used to extend more sophisticated query
evaluation schemes such as message-passing schemes [Van Gelder, 1986] and magic-set
transformations [Ullman, 1989].

3. Detecting irrelevant updates: A frequent operation in persistent knowledge
bases is recomputing a query after an update is made to the knowledge base. However,
in many cases this computation is wasted because it can be shown that the update
will not affect the query, even without actually computing it. We discuss how to
detect independence of queries from updates by formulating the problem in terms
of relevance reasoning. We show how to use the query-tree and other techniques
developed in Chapter 5 to detect independence efficiently.
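
The benefit of detecting irrelevant updates can be illustrated with a small caching sketch in Python. It rests on assumptions that are not part of the techniques of Chapter 5: the class CachedQuery is hypothetical, the relevance test is the step condition from Example 1.1 (100 < X and Y < 170), and the evaluate function is a stand-in for real query evaluation.

```python
class CachedQuery:
    """Cache a query answer and recompute it only after a relevant update."""

    def __init__(self, evaluate, affects_query):
        self.evaluate = evaluate            # computes the answer from the facts
        self.affects_query = affects_query  # relevance test for a single fact
        self.facts = set()
        self._answer = None
        self.evaluations = 0

    def insert(self, fact):
        self.facts.add(fact)
        if self.affects_query(fact):
            self._answer = None             # only relevant updates invalidate

    def answer(self):
        if self._answer is None:
            self.evaluations += 1
            self._answer = self.evaluate(self.facts)
        return self._answer

def step_relevant(fact):
    x, y = fact
    return 100 < x and y < 170

def evaluate(facts):
    # Stand-in for query evaluation: it depends only on the relevant facts.
    return sorted(f for f in facts if step_relevant(f))

cq = CachedQuery(evaluate, step_relevant)
cq.insert((120, 150))
first = cq.answer()       # computed once
cq.insert((50, 60))       # irrelevant update: the cached answer is kept
second = cq.answer()
cq.insert((130, 160))     # relevant update: the answer is recomputed
third = cq.answer()
```

After these three updates the query has been evaluated twice rather than three times, since the update step(50, 60) was shown to be independent of the query without evaluating it.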

4. Automatically creating abstractions: As stated earlier, having a more
parsimonious representation of the problem domain will lead to more efficient inference.
Abstracting a representation to eliminate irrelevant distinctions will result in more
parsimonious representations. We show how the creation of abstractions can be posed
as a task of relevance reasoning, based on the intuition that a good abstraction is one
that removes irrelevant details. We present several algorithms for automatically
creating abstractions, based on algorithms for detecting irrelevance.

1.2.4 Irrelevance Reasoning in Automated Modeling

An important domain in which relevance reasoning plays a key role is the domain of
modeling physical systems. We apply the framework in this domain to the task of
automatically selecting a model for a given system that is appropriate for answering
a given query. Briefly, the problem we consider can be formulated as follows. The
input consists of three elements:

• A domain theory.

• A system description.

• A query about the system.

The domain theory consists of a set of model fragments [Falkenhainer and Forbus,
1991]. Each model fragment describes a single phenomenon in the physical world. For
example, a model fragment may describe the dependence of the voltage of a battery on
its charge level, or it may describe the process of fluid flow through a pipe connecting
two containers. The same phenomenon may be described by several model fragments
that differ in the level of detail and the abstractions and approximations made. Each
model fragment has a set of operating conditions stating when it is applicable. An
adequate model of the system is a set of model fragments from the domain theory
that have consistent operating conditions.

The system description is a set of facts about the system, including structural
specifications and initial values of some of the system parameters. The query is some
parameter (or set of parameters), and the answer to the query is a description of the
changes of that parameter over time.

The output of the model formulation problem is a set of model fragments from the
domain theory that can be used to answer the query about the device. The goal of the
model formulation problem is to find the simplest model that can coherently explain
the value of the query parameter over time.

Our approach to the problem is based on the observation that several of its aspects 
can be viewed as instances of relevance reasoning. First, in order to decide which 
phenomena need to be included in a model of the device, we need to determine 
which, aspects of the domain are relevant to the specific query. This requires that 
we follow' the possible causal influences on thequery parameter, Second, in selecting 
among multiple mode! fragments representing different ways of modeling a specific 
phenomenon, we need to reason explicitly about the assumptions being made by each 
model fragment. We show that many of these assumptions can be stated as irrelevance 
claims about some aspects of the domain. Finally, the focus of our application is to 
select a model for simulation of the device over time. However, we do not know all 
the conditions that may arise in the states of the simulation. Therefore, our task is 
complicated by the fact that we need to select a model based only on partial knowledge 
about the states that may occur in the simulation. This is analogous to the problem
faced by the query-tree of deciding irrelevance by inspecting only a small part of the 
knowledge base. 

We describe a novel model formulation algorithm based on the above observations.
The algorithm selects the phenomena that can affect the query by following
the possible causal influences on the query parameter. After selecting a phenomenon
to include in the model, it chooses among the multiple ways of describing the
phenomenon by reasoning explicitly about the different assumptions distinguishing the
various descriptions, and choosing the simplest one that does not contradict
assumptions made earlier. In making the selections, the algorithm also incorporates
domain-specific knowledge that may be available from experts. This domain-specific
knowledge is expressed using the language provided in the framework.
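
The selection loop described above can be illustrated with a small sketch. This is not the algorithm of Chapter 7; it is a simplified illustration under stated assumptions: causal influences are given as a map from a parameter to the phenomena that may affect it, each phenomenon lists its candidate model fragments from simplest to most detailed, and an assumption is a named boolean. All names (`formulate_model`, `INFLUENCES`, `FRAGMENTS`, `friction_irrelevant`, and the fragment names) are hypothetical.

```python
from collections import deque

# Hypothetical domain theory for a pipe connecting two containers.
INFLUENCES = {
    "level_B": ["pipe_flow"],
    "pressure_A": ["container_pressure"],
}
FRAGMENTS = {  # candidates are ordered from simplest to most detailed
    "pipe_flow": [
        {"name": "frictionless_flow",
         "assumes": {"friction_irrelevant": True},
         "params": ["pressure_A"]},
        {"name": "viscous_flow",
         "assumes": {"friction_irrelevant": False},
         "params": ["pressure_A", "viscosity"]},
    ],
    "container_pressure": [
        {"name": "static_pressure", "assumes": {}, "params": ["level_A"]},
    ],
}


def formulate_model(query, influences, fragments, assumptions=None):
    """Follow causal influences from the query parameter and, for each
    phenomenon reached, pick the simplest fragment whose assumptions do
    not contradict the assumptions made so far."""
    assumptions = dict(assumptions or {})
    chosen = {}
    seen = {query}
    queue = deque([query])
    while queue:
        param = queue.popleft()
        for phenomenon in influences.get(param, []):
            if phenomenon in chosen:
                continue
            for frag in fragments[phenomenon]:
                consistent = all(
                    assumptions.get(name, value) == value
                    for name, value in frag["assumes"].items()
                )
                if consistent:
                    break
            else:
                raise ValueError(f"no consistent fragment for {phenomenon}")
            chosen[phenomenon] = frag["name"]
            assumptions.update(frag["assumes"])
            # Parameters mentioned by the fragment may themselves be influenced.
            for p in frag["params"]:
                if p not in seen:
                    seen.add(p)
                    queue.append(p)
    return chosen
```

With no prior assumptions the sketch picks the simplest flow fragment; if an earlier assumption states that friction is not irrelevant, the more detailed fragment is chosen instead. The model grows only as far as the causal paths and assumptions require.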

The algorithm has several desirable properties. First, it is guaranteed to select an 
adequate model for the simulation. Second, given the partial knowledge it has about 
the possible states of the simulation, it selects the simplest possible model. Finally, 
the running time of the algorithm is polynomial in the size of the resulting model. 

The algorithm has several advantages over previous algorithms [Nayak, 1992a;
Falkenhainer and Forbus, 1991; Addanki et al., 1989]. First, it addresses the problem
of formulating a model for simulation without creating a complete envisionment of
the possible states (as in [Falkenhainer and Forbus, 1991]). Second, the following of
possible causal paths by the algorithm frees the user from specifying possible causal
interactions explicitly (as in the component interaction heuristic [Nayak, 1992a]). This
advantage is important since specifying these interactions is a laborious and
error-prone task. Finally, unlike the algorithm proposed by Nayak [Nayak, 1992a], which
begins with the most complicated model and iteratively simplifies it, our algorithm
starts with the simplest model possible and makes it more complex only as required
by the modeling assumptions. Consequently, our algorithm is more likely to scale up
to larger devices.


1.3 Contributions of the Thesis 

The thesis makes the following important contributions:

• It presents a general proof theoretic framework for analyzing irrelevance. The
framework crystallizes the issues involved in relevance reasoning; it makes several
important distinctions in the analysis of irrelevance, and it generalizes and
unifies previous analyses.

• It presents a novel tool, the query-tree, for reasoning about irrelevance. The 
query-tree is used for: 

- Developing efficient algorithms for detecting strong irrelevance. The al- 
gorithms are sound and complete for knowledge bases containing function 
free Horn rules with a wide class of interpretable constraints. The query-tree
can also be designed to be complete for deriving strong irrelevance
restricted to minimal derivations and for rules having limited forms of 
negation in their antecedents. For arbitrary Horn rule knowledge bases, 
the query-tree provides a sound inference procedure for strong irrelevance. 

- Pushing the tightest constraints possible from a given query to the ground 
facts in the database. Consequently, a filter can be applied to the ground 
facts before evaluating the query. 
- Algorithms for deriving logical conclusions from irrelevance claims that are
given to the system by an external source. 

- A backward chaining algorithm that is guaranteed not to pursue useless 
paths. 

• It describes experimental results showing that in practice, relevance reasoning 
leads to significant speedup of inference.

• It describes an application of the framework to the problem of detecting inde- 
pendence of queries from updates, resulting in: 

- New decidable cases for detecting independence.

- Novel algorithms providing sufficient conditions for independence. 

• It describes an application of the framework to the problem of modeling phys- 
ical systems. Along with providing important insights into that problem, it 
presents a novel algorithm for automated model selection for simulation based 
on relevance reasoning.

• It presents a formal connection between relevance reasoning and reasoning with
multiple levels of abstraction. The connection enables better analysis of the
utility of reasoning with abstractions and the development of algorithms for
automatically creating abstractions for a given query.

The main contributions of the thesis are in the fields of knowledge representation 
and control of reasoning. It is important to note some of the contributions that the 
thesis makes from the perspective of related fields. Some were outlined in the opening 
of this chapter, and several are mentioned below. 

Deductive databases and logic programming: Much of the work in the thesis
can be couched in the terminology of these fields. A major contribution of the
query-tree is that it shows how to push constraints from the query to the database. 
Consequently, a filter can be applied to the database before query evaluation, lead- 
ing to significant savings. The query-tree and the algorithms discussed in Chapter 3 
make significant contributions to the optimization of query evaluation and of logic 
programs. Specifically, the query-tree can be viewed as a method for partial evalua- 
tion of constraint logic programs, extending previous methods in this field. 
Knowledge acquisition and knowledge engineering: The framework discussed 
in the thesis provides a basis for acquiring knowledge about irrelevance, both by
providing an expressive language and by indicating where additional knowledge is 
required. The query-tree can also be used as a tool for knowledge acquisition by 
illustrating the connections between pieces of knowledge in a knowledge base and by
determining the effects of adding knowledge to the KB. 

Reformulation: Subramanian [Subramanian, 1989] first analyzed irrelevance with
the goal of automating reformulations. The work presented in this thesis advances
Subramanian's analysis and suggests specific methods for discovering irrelevance and
creating abstractions. Therefore, it provides a basis for research on automatic
reformulation.


1.4 Readers’ Guide 

The thesis is organized as follows. Chapter 2 discusses the issues arising in analyzing 
irrelevance, and the proof theoretic framework we propose. It also discusses properties 
of irrelevance within the framework. Chapter 3 discusses the query-tree. It presents 
the general method for building a query-tree and describes several of its instances.
Chapter 4 discusses the usages of the query-tree in speeding inference. It presents
details of experimental results validating the impact of the methods in practice. It
also describes algorithms for deriving logical conclusions from given irrelevance claims.
Chapter 5 investigates the problem of detecting independence of queries from updates.
By translating the problem to reasoning about irrelevance, it provides new algorithms
for detecting independence. Chapter 6 makes the formal connection between relevance
reasoning and the creation of abstractions, suggesting a new approach to research on
reasoning with abstraction. The chapter demonstrates the utility of this approach by
considering the examples of predicate abstraction and removal of predicate arguments.
Chapter 7 discusses the application of the framework to the domain of modeling
physical systems. Chapter 8 concludes with a summary of the contributions, and
directions for future research. 

The relevant related work is discussed at the end of each chapter. Chapter 8
contains a tabular summary of the references to related work made in the thesis. The
thesis is written so that it can be read even by skipping the proofs. The proofs that
are included in the chapters are only the most important ones or ones that can add 
to the understanding of the text. Others appear in Appendix A. 

Some of the material covered in the thesis appears in shorter conference length
publications. The material in Chapter 2 and Section 4.2 appears in [Levy and Sagiv,
1993a]. Chapter 3 is covered in [Levy and Sagiv, 1992; Levy et al., 1993]. The
material of Chapter 5 is presented in [Levy and Sagiv, 1993b]. Finally, the material
of Chapter 7 is described in [Levy et al., 1992; Iwasaki and Levy, 1993].


Chapter 2 

Analyzing Irrelevance 


The notion of irrelevance is used in many contexts in AI research. However, it is 
typically used informally. Our goal is to state declaratively knowledge about
irrelevance and to reason with such knowledge in a principled manner. As a basis for
pursuing this goal, this chapter presents a general framework for analyzing formal
definitions of irrelevance. The framework is based on a proof theoretic analysis of
irrelevance. Section 2.1 begins by describing the motivations for analyzing irrelevance.
Section 2.2 describes the issues that arise in the analysis of irrelevance and the
possible approaches one can pursue. Section 2.3 describes our framework, consisting of
a space of definitions of irrelevance. It also discusses properties of definitions in the
space. Finally, Section 2.4 formally presents the problem of automatically deriving
irrelevance claims.


2.1 Motivations for Analyzing Irrelevance 

The main goal of our analysis of irrelevance is speeding up inference, or, more gen- 
erally, problem-solving. Irrelevance analysis can be used in speeding up inference in 
two ways. First, by determining that certain formulas in the knowledge base are irrel- 
evant to a given query, the inference engine can ignore these formulas, and therefore 
prune its search significantly. Second, by identifying distinctions in the
representation that are irrelevant to a specific query, we can create an abstract and more
parsimonious representation. We can then translate the knowledge base into this new
representation, resulting in more efficient inference.

The analysis of irrelevance is also important in other contexts in AI for purposes 
other than speeding up inference. In some cases an understanding of irrelevance is
needed in order to determine which inferences can be made, as the following examples 
illustrate. 
1. Non monotonic reasoning: In non monotonic reasoning, the addition of
knowledge can cause previous conclusions to be retracted. Consequently,
conclusions drawn in non monotonic reasoning formalisms depend in subtle ways
on which knowledge is considered. The following example (from [Pearl, 1990])
illustrates that dependency:

Example 2.1: Consider a knowledge base containing the following: 

Birds typically have wings. 

Birds typically fly.

Penguins are birds.

Penguins don't fly.

Suppose our query is: Do penguins have wings? The difficulty in answering the
query is that penguins are abnormal with respect to flying, and therefore may
be abnormal in other ways too, such as having wings. However, the fact that
a specific bird is a penguin is irrelevant to whether it has wings or not, and
therefore we would like to ignore the abnormality of penguins and to conclude
that penguins do have wings.¹ As another example, suppose our query is: Can
red birds fly? Here too, we are asking about a property of a subclass of birds,
which, as with the subclass of penguins, may be abnormal with respect to flying.
However, the fact that a bird is red is irrelevant to its flying ability. □

Designing non monotonic reasoning formalisms that are able to ignore irrele- 
vant information has received a lot of attention recently [Pearl, 1990; Geffner 
and Pearl, 1990; Bacchus et al., 1993]. However, in that work the notion of
irrelevance is either treated informally, or the definitions that are used are very
simple-minded.

2. Reasoning by analogy: In reasoning by analogy we conclude properties of one 
object from properties of another, based on a possible analogy existing between 
the two objects. However, for the reasoning to be meaningful, the analogy 
between the objects must be relevant to the property being concluded. For 
example, suppose we state the analogy: Fred is like a fire-engine. Intuitively,
we may use that to conclude that Fred is loud, or that his activity level is high.
However, it seems improper to conclude that Fred’s color is red, or that his fuel
consumption is medium, since these properties are irrelevant to the analogy
between Fred and a fire-engine. 

¹ It may be argued that having wings is relevant to the ability of a bird to fly. However, this is
not explicitly stated in the knowledge base. The first statement of this example can also be replaced 
by any property of birds that is completely unconnected to flying. 
The notion of irrelevance also plays an important role in designing algorithms for
abductive reasoning [Levesque, 1989] and for belief revision [Gardenfors, 1988].


2.2 Issues in Analyzing Irrelevance 

In this section we discuss several of the issues that arise in an analysis of irrelevance,
and provide the motivations underlying our approach. Throughout this chapter we
will be concerned with defining formula-irrelevance, i.e., given a knowledge base of
formulas $\Delta$, a query $q$ and a formula $f$ (not necessarily in $\Delta$), when do we say that
$f$ is irrelevant to $q$ with respect to $\Delta$. Irrelevance of other kinds of subjects, such
as predicates, objects, predicate refinements and object refinements are considered in 
Chapter 6. 


Two Possible Approaches: Common Sense Formalization vs. Problem-Solving
Analysis

Broadly, we distinguish two possible approaches to analyzing irrelevance. The first
approach, which has been pursued by several philosophers ([Keynes, 1921; Carnap,
1950; Gardenfors, 1978]), is to try to capture our common sense notion of irrelevance
with a formal definition. In that approach, we would consider a formal definition of
irrelevance and check whether it satisfies properties which we consider natural for our
intuitive notion of irrelevance.

The second approach is to analyze the ways in which irrelevance arises in
problem-solving. Here too, we would consider various definitions of irrelevance and investigate
their properties. However, the properties of interest will be those that are
informative in designing inference methods that utilize irrelevance. To illustrate this point,
consider the following example. 

Example 2.2: Suppose we have the following knowledge base $\Delta_0$:

$r_1$ : $\mathit{attendClass}(X, Y) \Rightarrow \mathit{pass}(X, Y)$.

$r_2$ : $\mathit{passExam}(X, Y) \Rightarrow \mathit{pass}(X, Y)$.

$r_3$ : $\mathit{pass}(X, Y) \wedge \mathit{tookGradCourse}(X) \Rightarrow \mathit{canTA}(X, Y)$.

$r_4$ : $\mathit{pass}(X, Y) \wedge (Y > 200) \Rightarrow \mathit{tookGradCourse}(X)$.

$g_1$ : $\mathit{attendClass}(\mathit{Fred}, 101)$.

$g_2$ : $\mathit{passExam}(\mathit{Fred}, 101)$.

$g_3$ : $\mathit{passExam}(\mathit{Fred}, 161)$.

$g_4$ : $\mathit{passExam}(\mathit{Fred}, 123)$.

$g_5$ : $\mathit{passExam}(\mathit{Fred}, 202)$.

And let our query be $q$ : $\mathit{canTA}(\mathit{Fred}, 101)$.

Each of the ground atoms $g_1$-$g_4$ can be considered irrelevant to the query in
isolation, because for each of them, the query can be derived without using it.
However, there are differences between these irrelevance claims. For example, $g_3$ and
$g_4$ will not be part of any derivation of the query, and therefore can both be ignored.
Even though the query can be derived without $g_1$ or $g_2$, one of them is always needed,
and therefore we cannot remove both. The ground atom $\mathit{canTA}(\mathit{Fred}, 202)$ can also
be considered irrelevant to the query, but in a somewhat weaker sense. Although it
is never part of any derivation of the query, it is always entailed by the formulas used
in the derivation of the query.

Consider the query $\mathit{canTA}(\mathit{Fred}, 202)$. The atom $\mathit{passExam}(\mathit{Fred}, 210)$ can be
part of a derivation of this query (if it were in the KB), but such a derivation would
not be minimal, in the sense that the set of ground atoms that it uses from the KB
is not minimal (i.e., using $g_5$ is enough for entailing the query). Finally, the rule

$\mathit{passExam}(X, Y) \wedge (Y > 200) \Rightarrow \mathit{canTA}(X, Y)$

could also be considered irrelevant, since the query can be derived without it.
However, for some inference mechanisms, it may be the case that this rule will speed
inference, since the query can be derived using one rule application instead of two. □
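
The distinctions drawn in Example 2.2 can be checked mechanically on this small knowledge base. The following is a minimal sketch, not the query-tree method developed later in the thesis: it hard-codes the four rules as a backward chainer over ground atoms and collects, for every derivation of a ground query, the set of ground facts it uses. The function names (`bases`, `minimal_bases`) and the tuple encoding of atoms are assumptions made for illustration.

```python
from itertools import product

# Ground facts of Example 2.2, keyed by their labels.
FACTS = {
    "g1": ("attendClass", "Fred", 101),
    "g2": ("passExam", "Fred", 101),
    "g3": ("passExam", "Fred", 161),
    "g4": ("passExam", "Fred", 123),
    "g5": ("passExam", "Fred", 202),
}


def bases(goal, facts):
    """Return the set of fact-label sets used by the derivations of `goal`."""
    by_atom = {atom: label for label, atom in facts.items()}
    # Candidate course numbers for the unbound Y in rule r4.
    courses = {atom[2] for atom in facts.values()}

    def combine(left, right):
        # A conjunction is derived by pairing a derivation of each conjunct.
        return {a | b for a, b in product(left, right)}

    def solve(atom):
        found = set()
        if atom in by_atom:  # the atom is itself a ground fact
            found.add(frozenset([by_atom[atom]]))
        pred = atom[0]
        if pred == "pass":
            _, x, y = atom
            found |= solve(("attendClass", x, y))  # r1
            found |= solve(("passExam", x, y))     # r2
        elif pred == "canTA":
            _, x, y = atom
            found |= combine(solve(("pass", x, y)),
                             solve(("tookGradCourse", x)))  # r3
        elif pred == "tookGradCourse":
            _, x = atom
            for y in courses:
                if y > 200:
                    found |= solve(("pass", x, y))  # r4
        return found

    return solve(goal)


def minimal_bases(all_bases):
    """Keep only the bases that have no proper subset among the bases."""
    return {b for b in all_bases if not any(other < b for other in all_bases)}
```

For the query $\mathit{canTA}(\mathit{Fred}, 101)$ the sketch yields exactly two bases, $\{g_1, g_5\}$ and $\{g_2, g_5\}$: $g_3$ and $g_4$ appear in neither, while one of $g_1$, $g_2$ is always present. Adding $\mathit{passExam}(\mathit{Fred}, 210)$ and asking for $\mathit{canTA}(\mathit{Fred}, 202)$ yields one base containing the new atom, but it is not minimal.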

The analysis presented in this dissertation is based on the ways in which irrelevance 
arises in problem solving, for two reasons:

1. Our prime concern is speeding up inference, and therefore, we desire that our 
analysis provide the distinctions necessary to exploit irrelevance in inference
methods, such as those illustrated in the example. 

2. Moreover, even if we could agree on a single best formalization of our common 
sense notion of irrelevance, it will have many different manifestations in infer- 
ence depending on the form of the knowledge base and the inference method. 
It is therefore important to distinguish these manifestations in order to develop 
methods for speeding up inference.

Clearly, these two approaches to analyzing irrelevance are not independent of each 
other. On the one hand, the analysis of irrelevance that we consider is inspired by 
our common sense notion of the concept, and the definitions we examine mirror it in 
various ways. We will also see that the distinctions made in our analysis correspond 
to properties of the common sense notion of irrelevance. On the other hand, given a 
formalization of the common sense notion of irrelevance, analyzing it in our
framework will provide a way of using it for speeding up inference. However, it should
be emphasized that the approach we have taken is intended to be evaluated on its
usefulness for speeding up inference, not on how well it captures intuitive notions of
irrelevance. 


Irrelevance With Respect to Given Evidence 

Much of the work on formalizing irrelevance (including the work in the philosophy
literature) has focused on the following question:

• Given a set of evidence $E$ and query $q$, which formulas are irrelevant to $q$ with
respect to the evidence?

In our analysis, the knowledge base acts as the evidence, and therefore the question
we address is the following:

• Which parts of a given knowledge base $\Delta$ are irrelevant to $q$?

The difference between the two questions is that in the first, the set of evidence
formulas is given special treatment by being given priority over the other formulas. In
the second, the KB $\Delta$ acts as our evidence; however, our goal is to find which parts
of the evidence are irrelevant to the query. It may seem that the second question
can be considered an instance of the first by equating $\Delta$ and $E$.² However, several
assumptions made in addressing the first question (e.g., [Gardenfors, 1978]) make it
impractical to use the solutions for the second question. For example, one assumption
is that any formula $f \in E$ will be considered irrelevant to the query (since it is already
known and does not change the state of affairs). Another property considered is
independence of the form of the evidence, i.e., if $E_1$ is equivalent to $E_2$, then the
formula $f$ is irrelevant to $q$ w.r.t. $E_1$ if and only if it is irrelevant to $q$ w.r.t. $E_2$.
In Section 2.3.4 we will see that our framework is general enough to accommodate
definitions of irrelevance that give special treatment to a subset of evidence formulas.

Irrelevance as Belief Revision 

Intuitively, a formula $f$ is irrelevant to a query $q$ with respect to a KB $\Delta$ if $f$ can
be removed from $\Delta$ and $q$ will still be derivable. This intuition can be generalized
by relating irrelevance to the notion of belief revision. Specifically, a formula $f$ is
irrelevant to a query $q$ w.r.t. a KB $\Delta$, if the result of revising $\Delta$ so that it does
not entail $f$ does not affect its entailment of $q$. Formally, let $\circ$ be a belief revision
operator. Given a knowledge base $\Delta$ and a formula $\phi$, $\Delta \circ \phi$ returns a (consistent!)
revision of $\Delta$ that entails $\phi$. Using this operator, we can define irrelevance as follows:³

$$\mathit{Irrelevant}(f, q, \Delta) \iff \big( (\Delta \circ f) \models q \iff (\Delta \circ \neg f) \models q \big)$$

² Another way to relate these two questions is by considering the set of evidence empty. However,
in such cases, the proposed solutions to the first question trivialize.

³ Note that if $\Delta$ is consistent, then either $(\Delta \circ f)$ or $(\Delta \circ \neg f)$ will be simply $\Delta \cup f$.
Many definitions have been proposed in the literature for a belief revision operator
(e.g., [Dalal, 1988; Winslett, 1990; Fagin et al., 1983; Nebel, 1989; Alchourron et al.,
1985]). There is little agreement on a single best definition, and the properties that are
widely considered desirable of such an operator (e.g., the AGM postulates [Alchourron
et al., 1985]) give us little information about the resulting properties of irrelevance. In
this dissertation we address directly the notion of irrelevance. However, investigating 
connections between our analysis and belief revision is an interesting area of research. 

Our Approach 

In order for our analysis of irrelevance to be useful, we desire that it enable us to 
make sufficient distinctions to answer the following questions: 

1. Can we decide automatically which formulas are irrelevant to a given query?
Can we do so efficiently?

2. If an irrelevant formula is removed, is inference guaranteed to be more efficient? 

3. How can we automatically derive additional irrelevance claims? For example, 
does irrelevance of a formula imply the irrelevance of a syntactically related 
formula?

In order to capture the distinctions needed to answer these questions, we present
an analysis of irrelevance in terms of the possible paths that an inference engine
may pursue in the solution of a query. We present a space of possible definitions of
irrelevance and investigate the properties of various definitions in the space. In our
discussion, we focus on inference mechanisms that attempt to construct derivations
of answers to the query, and therefore, paths are actually the possible derivations
that the inference mechanism may consider in its search. However, the framework is
general and can accommodate other types of problem solving methods. An example
of other methods will be discussed in Chapter 7 when we consider reasoning about
physical systems. As examples of definitions in our space, we may consider $f$ to be
irrelevant if there is some derivation of $q$ that does not use $f$, or if $f$ is not used in
any derivation of $q$, or if $f$ is not used in any minimal derivation of $q$. The next
section presents the space of definitions of irrelevance.


2.3 A Space of Definitions 

2.3.1 Preliminaries 

In our discussion we assume the theory of the domain is represented by a knowledge
base of closed formulas $\Delta$, in first order predicate calculus. We assume that the
inference mechanism employs a set of sound inference rules $\mathcal{S}$. A derivation $D$ of
a closed formula $\psi$ from $\Delta$ is a sequence of formulas, $\alpha_1, \ldots, \alpha_n$, such that $\alpha_n = \psi$
and for each $i$ ($1 \le i \le n$), either $\alpha_i \in \Delta$, $\alpha_i$ is a logical axiom, or $\alpha_i$ is the result
of applying a rule in $\mathcal{S}$ to some elements $\alpha_{i_1}, \ldots, \alpha_{i_k}$ that appear prior to $\alpha_i$. The
formulas $\alpha_{i_1}, \ldots, \alpha_{i_k}$ are said to be immediate subgoals of $\alpha_i$. The set of formulas
in $D$ that do not have any subgoal is called the base of the derivation, denoted by
$Base(D)$. The set $Base(D)$ represents a "support set" for $\psi$ in the knowledge base.
We consider only derivations in which every $\alpha_i$ is a subgoal of $\psi$ (not necessarily an
immediate subgoal).
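
The definition of a derivation and its base can be made concrete with a small data structure. The sketch below is an illustration only: formulas are represented by their labels from Example 2.2, each step records the positions of its immediate subgoals, and the names `Step`, `base` and `is_well_formed` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    formula: str          # label or text of the formula at this position
    premises: tuple = ()  # positions of the immediate subgoals (earlier steps)


def base(derivation):
    """Formulas of the derivation that have no subgoal (the support set)."""
    return {step.formula for step in derivation if not step.premises}


def is_well_formed(derivation):
    """Check that subgoals precede their conclusion and that every step is
    a (possibly indirect) subgoal of the final formula."""
    if not derivation:
        return False
    for i, step in enumerate(derivation):
        if any(p < 0 or p >= i for p in step.premises):
            return False
    reachable, stack = set(), [len(derivation) - 1]
    while stack:
        i = stack.pop()
        if i not in reachable:
            reachable.add(i)
            stack.extend(derivation[i].premises)
    return len(reachable) == len(derivation)


# One derivation of canTA(Fred, 101) from Example 2.2, using g1 and g5.
EXAMPLE = [
    Step("g1"),                               # attendClass(Fred, 101)
    Step("r1"),
    Step("pass(Fred, 101)", (0, 1)),
    Step("g5"),                               # passExam(Fred, 202)
    Step("r2"),
    Step("pass(Fred, 202)", (3, 4)),
    Step("r4"),
    Step("tookGradCourse(Fred)", (5, 6)),
    Step("r3"),
    Step("canTA(Fred, 101)", (2, 7, 8)),
]
```

Here `base(EXAMPLE)` contains the two ground facts and the four rules used, and the derivation is well formed because every step feeds into the final formula. A sequence that carries an unused formula such as $g_3$ would be rejected by `is_well_formed`, matching the restriction stated above.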

A query is represented by a formula $\psi$. If $\psi$ is a closed formula (i.e., has no free
variables), then the answer is true if the inference mechanism can find some derivation
of $\psi$ from $\Delta$, and false otherwise.⁴ If $\psi$ contains free variables, the answer is the
set of assignments for the free variables, such that the resulting closed formulas are
derivable from $\Delta$;⁵ in this case, a derivation is a set containing a single derivation for
each answer. A query may have several derivations from a given knowledge base, and
we denote the set of those derivations by $\mathcal{D}_\psi(\Delta)$ (note that if $\psi$ has free variables,
then $\mathcal{D}_\psi(\Delta)$ is a set of sets of derivations).

Our goal is to define the meaning of an irrelevance claim stating that a formula
$\phi$ is irrelevant to a query $\psi$ with respect to a knowledge base $\Delta$. The formula $\phi$ is
called the subject of the irrelevance claim.

2.3.2 The Axes 

As stated, we describe a space of possible definitions of irrelevance. Definitions in 
the space vary along two axes. In the first axis we consider different ways of defining 
derivation irrelevance, i.e., irrelevance of a subject $\phi$ with respect to a single derivation
$D$ of the query $\psi$. Derivation irrelevance is given by defining a binary predicate
$DI(\phi, D)$. The following are a few examples of how $DI$ can be defined:

• $DI_1(\phi, D)$ iff $\phi \notin Base(D)$.

• $DI_2(\phi, D)$ iff $\phi \notin D$.

• $DI_3(\phi, D)$ iff $Base(D) \not\models \phi$.

• $DI_4(\phi, D)$ iff $Base(D) \not\models \phi$ and $Base(D) \not\models \neg\phi$.

⁴ We can return unknown if neither $\psi$ nor $\neg\psi$ are derivable. However that does not affect our
discussion.

⁵ An alternative definition often considered (e.g., in Prolog) is finding one variable binding that
satisfies the query formula. However, this distinction does not affect our discussion.
Definition $DI_1$ requires that $\phi$ not be in the support set of the derivation $D$. Definition
$DI_2$ is stronger and requires that $\phi$ not be anywhere in $D$. Definition $DI_3$ is even
stronger and requires that $\phi$ not be a logical consequence of the formulas in $Base(D)$,
and $DI_4$ requires that $\neg\phi$ not be a logical consequence either. The relationship
between these definitions of $DI$ can therefore be summarized as follows:

Proposition 2.3: $DI_4(\phi, D) \Rightarrow DI_3(\phi, D) \Rightarrow DI_2(\phi, D) \Rightarrow DI_1(\phi, D)$.
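
The four example definitions and the chain in Proposition 2.3 can be tried out on a toy propositional derivation. The sketch below is an illustration under simplifying assumptions, not part of the thesis: formulas are atoms (strings) or nested tuples such as `("implies", "p", "q")`, entailment is decided by enumerating truth assignments, and a derivation is a dictionary holding its set of formulas (`"steps"`) and its base (`"base"`). All function names are hypothetical.

```python
from itertools import product


def atoms(f):
    """Atoms occurring in a formula given as a string or a nested tuple."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(sub) for sub in f[1:]))


def holds(f, world):
    """Truth value of formula f under the assignment `world`."""
    if isinstance(f, str):
        return world[f]
    op = f[0]
    if op == "not":
        return not holds(f[1], world)
    if op == "and":
        return holds(f[1], world) and holds(f[2], world)
    if op == "or":
        return holds(f[1], world) or holds(f[2], world)
    if op == "implies":
        return (not holds(f[1], world)) or holds(f[2], world)
    raise ValueError(f"unknown connective: {op}")


def entails(premises, f):
    """Truth-table test of premises |= f (exponential; toy inputs only)."""
    names = sorted(set().union(atoms(f), *(atoms(p) for p in premises)))
    for values in product([False, True], repeat=len(names)):
        world = dict(zip(names, values))
        if all(holds(p, world) for p in premises) and not holds(f, world):
            return False
    return True


def di1(phi, d):
    return phi not in d["base"]


def di2(phi, d):
    return phi not in d["steps"]


def di3(phi, d):
    return not entails(d["base"], phi)


def di4(phi, d):
    return di3(phi, d) and not entails(d["base"], ("not", phi))


# Derivation of q from p and p -> q by modus ponens.
BASE = {"p", ("implies", "p", "q")}
DERIVATION = {"base": BASE, "steps": BASE | {"q"}}
```

On this derivation, `"q"` satisfies only $DI_1$, the formula `("or", "q", "r")` satisfies $DI_1$ and $DI_2$ but not $DI_3$, `("not", "q")` satisfies $DI_3$ but not $DI_4$, and the unrelated atom `"r"` satisfies all four, so each implication in the chain is strict here. The step from $DI_3$ to $DI_2$ relies on the inference rules being sound, so that every formula of the derivation is entailed by its base.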

Requiring that $DI$ holds for all possible derivations of the query may be too
restrictive. Therefore, in the second axis we consider different subsets of the derivations
of the query for which we require $DI$ to hold. Formally, given the possible
derivations of $\psi$ from $\Delta$, $\mathcal{D}_\psi(\Delta)$, we consider a subset $\mathcal{D}_0(\Delta)$ of $\mathcal{D}_\psi(\Delta)$ (which may be
$\mathcal{D}_\psi(\Delta)$ itself), and require that $DI$ hold for derivations in $\mathcal{D}_0(\Delta)$. For example, we
can require $DI$ to hold only for the set of minimal derivations. In Section 2.3.4 we
consider several definitions of minimality for a derivation. As another example, we
can consider only the set of derivations bounded by some resource constraint.

Given a choice for $DI$ and $\mathcal{D}_0$ we give two definitions of irrelevance, depending
on whether $DI$ is required to hold for all derivations in $\mathcal{D}_0(\Delta)$ or for some derivation
in $\mathcal{D}_0(\Delta)$.⁶ Formally, a definition of irrelevance in our space is given as follows:

Definition 2.4: Suppose we are given:

1. a knowledge base $\Delta$,

2. a closed formula $\phi$ (the subject),

3. a query $\psi$,

4. a predicate $DI(\tau, D)$ specifying when a formula $\tau$ is irrelevant with respect to
a derivation $D$,

5. a subset $\mathcal{D}_0(\Delta)$ of $\mathcal{D}_\psi(\Delta)$.

The formula $\phi$ is said to be weakly irrelevant to $\psi$ with respect to $\Delta$, $DI$ and $\mathcal{D}_0$,
denoted by $WI(\phi, \psi, \Delta, DI, \mathcal{D}_0)$, if $DI(\phi, D)$ holds for some $D \in \mathcal{D}_0(\Delta)$.

The formula $\phi$ is said to be strongly irrelevant to $\psi$ with respect to $\Delta$, $DI$ and $\mathcal{D}_0$,
denoted by $SI(\phi, \psi, \Delta, DI, \mathcal{D}_0)$, if $DI(\phi, D)$ holds for every $D \in \mathcal{D}_0(\Delta)$.

If $\mathcal{D}_\psi(\Delta)$ is empty (i.e., $\psi$ is not derivable from $\Delta$ using $\mathcal{S}$), the formula $\phi$ is both
weakly and strongly irrelevant to $\psi$. □

⁶ We can also consider other ways of quantifying over the set $\mathcal{D}_0(\Delta)$, such as requiring that $DI$
holds for some percent of the derivations in $\mathcal{D}_0(\Delta)$. Here we consider only universal and existential
quantification.
In our discussion we want to refer to irrelevance of a set of formulas. Formally,
we define irrelevance of a set of formulas by extending the definition of $DI$:

Definition 2.5: If $\Phi$ is a set of formulas, $DI(\Phi, D)$ holds if $DI(\phi_i, D)$ holds for every
$\phi_i \in \Phi$. □

The definitions of strong and weak irrelevance remain unchanged. It will also be 
useful to state irrelevance claims that hold for a set of knowledge bases. For example, 
in the context of Horn rule knowledge bases, we will want to know whether a rule is 
irrelevant with respect to all the knowledge bases that differ only in ground atomic 
facts. We extend the definitions to sets of knowledge bases as follows: 

Definition 2.6: Let $\Sigma$ be a set of knowledge bases. We say that $\phi$ is weakly irrelevant
to $\psi$ with respect to $\Sigma$, denoted by $WI(\phi, \psi, \Sigma, DI, \mathcal{D}_0)$, if $\phi$ is weakly irrelevant to
$\psi$ with respect to every KB in $\Sigma$, i.e., if $WI(\phi, \psi, \Delta, DI, \mathcal{D}_0)$ holds for every $\Delta \in \Sigma$.
The definition for strong irrelevance is extended likewise. Note that $\mathcal{D}_0$ is actually a
function that for every given $\Delta \in \Sigma$ returns a subset of $\mathcal{D}_\psi(\Delta)$. □
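
Definitions 2.4 through 2.6 amount to an existential or universal quantifier over a chosen set of derivations, lifted to sets of subjects and to sets of knowledge bases. The following sketch illustrates this; it is not an implementation from the thesis. Derivations are represented only by the set of ground-fact labels in their base, the derivation-irrelevance predicate is a simplified version of $DI_1$, and the convention for an empty set of derivations is applied to whatever set of derivations is passed in, which is an assumption of the sketch. All names are hypothetical.

```python
def di_not_in_base(subject, derivation_base):
    """Simplified DI_1: the subject is not in the base of the derivation."""
    return subject not in derivation_base


def lift_to_sets(di):
    """Definition 2.5: DI holds for a set if it holds for every member."""
    def di_set(subject, derivation):
        if isinstance(subject, (set, frozenset)):
            return all(di(phi, derivation) for phi in subject)
        return di(subject, derivation)
    return di_set


def weakly_irrelevant(subject, derivations, di):
    """WI: DI holds for some derivation (or there are no derivations)."""
    derivations = list(derivations)
    if not derivations:
        return True
    return any(di(subject, d) for d in derivations)


def strongly_irrelevant(subject, derivations, di):
    """SI: DI holds for every derivation (vacuously true if there are none)."""
    return all(di(subject, d) for d in derivations)


def for_every_kb(claim, subject, derivations_by_kb, di):
    """Definition 2.6: the claim must hold for every knowledge base."""
    return all(claim(subject, ds, di) for ds in derivations_by_kb.values())


# Bases of the two derivations of canTA(Fred, 101) in Example 2.2.
DERIVATIONS_Q = [frozenset({"g1", "g5"}), frozenset({"g2", "g5"})]
DI = lift_to_sets(di_not_in_base)
```

On the derivations of Example 2.2, $g_1$ is weakly but not strongly irrelevant, the set $\{g_3, g_4\}$ is strongly irrelevant, and the set $\{g_1, g_2\}$ is not even weakly irrelevant although each member is, which is the situation described in the example where both atoms cannot be removed together.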



[Figure 2.1: diagram not reproduced. One axis is labeled "Derivation-irrelevance: Irrelevance w.r.t. a single derivation, DI".]


Figure 2.1: A space of definitions of irrelevance. The first axis consists of different
definitions of derivation irrelevance. The second axis consists of the set of derivations
considered. Weak irrelevance and strong irrelevance differ on the way we quantify
$DI$ over the derivations chosen in the second axis.

The space of definitions is summarized in Figure 2.1. In Example 2.2 presented
earlier, we can see different kinds of irrelevance claims. The atom $g_1$ (as well as the
atom $g_2$) is weakly irrelevant to the query $q = \mathit{canTA}(\mathit{Fred}, 101)$, since there is a
derivation of $q$ that does not use $g_1$ (i.e., uses $g_2$ instead). Consequently,
$WI(g_2, q, \Delta_0, DI_2, \mathcal{D}_q)$ holds. The atom $g_3$ (as well as $g_4$) is strongly irrelevant to $q$,
because none of the derivations of $q$ use it. Consequently, $SI(g_3, q, \Delta_0, DI_2, \mathcal{D}_q)$ and
$SI(\{g_3, g_4\}, q, \Delta_0, DI_2, \mathcal{D}_q)$ hold.

The atom $q_1 = \mathit{canTA}(\mathit{Fred}, 202)$ is strongly irrelevant to $q$ if we consider
derivation irrelevance based on $DI_2$. However, if we consider derivation irrelevance based
on $DI_3$, it is not strongly irrelevant, since the formulas used to derive $q$ can also be
used to derive $q_1$.

Finally, if we consider the set of all derivations of the query $q_1$, $\mathcal{D}_{q_1}$, then the atom
$\mathit{passExam}(\mathit{Fred}, 210)$ is not strongly irrelevant to the query, since it can be used in
a derivation of $q_1$ (to derive $\mathit{tookGradCourse}(\mathit{Fred})$). However, if we consider only
derivations in which $Base(D)$ is minimal (i.e., there is no proper subset of $Base(D)$ that is
enough to derive the query), then $\mathit{passExam}(\mathit{Fred}, 210)$ would not be part of any
derivation of $q_1$, and would therefore be strongly irrelevant to it.

2.3.3 Properties of Definitions in the Space

Several general properties of definitions in the space will be useful in the analysis of 
specific definitions. The following lemma establishes an ordering on definitions in the
space, and will enable us to derive properties of definitions based on properties of
other definitions in the space.

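Lemma 2.7: Let $DI$, $DI_i$ and $DI_j$ be definitions of derivation irrelevance, $\Phi$ be a
set of formulas, $\psi$ be a query and $\Sigma$, $\Sigma_1$ and $\Sigma_2$ be sets of knowledge bases. Finally, let
$\mathcal{D}_0$, $\mathcal{D}_1$, $\mathcal{D}_2$ be functions that, given a KB $\Delta$ and a query $\psi$, return a subset of $\mathcal{D}_\psi(\Delta)$.

1. If $DI_i(\tau, D) \Rightarrow DI_j(\tau, D)$ for any set of formulas $\tau$ and derivation $D$, then for
any set of formulas $\Phi$, query $\psi$, set of knowledge bases $\Sigma$ and set $\mathcal{D}_0$,

$$SI(\Phi, \psi, \Sigma, DI_i, \mathcal{D}_0) \Rightarrow SI(\Phi, \psi, \Sigma, DI_j, \mathcal{D}_0)$$

and

$$WI(\Phi, \psi, \Sigma, DI_i, \mathcal{D}_0) \Rightarrow WI(\Phi, \psi, \Sigma, DI_j, \mathcal{D}_0).$$

2. If $\mathcal{D}_1(\Delta) \subseteq \mathcal{D}_2(\Delta)$ for any knowledge base $\Delta \in \Sigma$, then for any set of formulas
$\Phi$, query $\psi$ and definition $DI$:

$$SI(\Phi, \psi, \Sigma, DI, \mathcal{D}_2) \Rightarrow SI(\Phi, \psi, \Sigma, DI, \mathcal{D}_1)$$

holds. For weak irrelevance, the opposite holds:

$$WI(\Phi, \psi, \Sigma, DI, \mathcal{D}_1) \Rightarrow WI(\Phi, \psi, \Sigma, DI, \mathcal{D}_2).$$

3. For any $\Phi$, $\psi$, $DI$, $\Sigma$ and $\mathcal{D}_0$,

$$SI(\Phi, \psi, \Sigma, DI, \mathcal{D}_0) \Rightarrow WI(\Phi, \psi, \Sigma, DI, \mathcal{D}_0).$$

4. If $\Sigma_1 \subseteq \Sigma_2$ then

$$SI(\Phi, \psi, \Sigma_2, DI, \mathcal{D}_0) \Rightarrow SI(\Phi, \psi, \Sigma_1, DI, \mathcal{D}_0)$$

$$WI(\Phi, \psi, \Sigma_2, DI, \mathcal{D}_0) \Rightarrow WI(\Phi, \psi, \Sigma_1, DI, \mathcal{D}_0)$$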

Proof: The proofs follow straightforwardly from the definitions. Consider Part 1. Suppose $SI(\Phi, \psi, \Sigma, DI_i, \mathcal{D}_0)$ holds and let $\Delta \in \Sigma$. Therefore, for every derivation $D \in \mathcal{D}_0(\Delta)$, $DI_i(\Phi, D)$ holds and therefore, by the assumption of the lemma, $DI_j(\Phi, D)$ holds. Consequently, $DI_j(\Phi, D)$ holds for every $D \in \mathcal{D}_0(\Delta)$, and so $SI(\Phi, \psi, \Sigma, DI_j, \mathcal{D}_0)$ holds. The proof for $WI$ is similar.

Part 2 about strong irrelevance follows from the observation that if $DI$ holds for all derivations in the set $\mathcal{D}_2(\Delta)$, it will hold also for all derivations in its subset $\mathcal{D}_1(\Delta)$. For weak irrelevance, the claim follows from the observation that if a property holds for some derivation in $\mathcal{D}_1(\Delta)$, it will obviously hold for some derivation in $\mathcal{D}_2(\Delta)$.

Parts 3 and 4 are immediate consequences of the definitions. ∎

An important property of irrelevance claims is whether they are closed under the union of their subjects. This is important when a system needs to determine whether it can use all the irrelevance claims it has, or whether using certain ones will falsify others.

Observation 2.8: Closure under union: Weak irrelevance claims are not closed under the union of their subjects in general. In contrast, for strong irrelevance claims (and combinations of strong and weak irrelevance) we have a sufficient condition for closure that depends only on $DI$. Specifically, whenever

$$DI(\Phi_1, D) \wedge DI(\Phi_2, D) \Rightarrow DI(\Phi_1 \cup \Phi_2, D)$$

holds for any derivation $D$ and sets $\Phi_1$, $\Phi_2$, then for any choice of $\mathcal{D}_0$ and $\Sigma$,

$$SI(\Phi_1, \psi, \Sigma, DI, \mathcal{D}_0) \wedge SI(\Phi_2, \psi, \Sigma, DI, \mathcal{D}_0) \Rightarrow SI(\Phi_1 \cup \Phi_2, \psi, \Sigma, DI, \mathcal{D}_0)$$

and

$$SI(\Phi_1, \psi, \Sigma, DI, \mathcal{D}_0) \wedge WI(\Phi_2, \psi, \Sigma, DI, \mathcal{D}_0) \Rightarrow WI(\Phi_1 \cup \Phi_2, \psi, \Sigma, DI, \mathcal{D}_0).$$

However,

$$WI(\Phi_1, \psi, \Sigma, DI, \mathcal{D}_0) \wedge WI(\Phi_2, \psi, \Sigma, DI, \mathcal{D}_0) \Rightarrow WI(\Phi_1 \cup \Phi_2, \psi, \Sigma, DI, \mathcal{D}_0)$$

does not hold in general.

The reverse holds for all definitions, i.e., if $\Phi$ is irrelevant to $\psi$ and $\Phi_1 \subseteq \Phi$, then $\Phi_1$ is irrelevant to $\psi$.

Proof: Suppose $SI(\Phi_1, \psi, \Sigma, DI, \mathcal{D}_0) \wedge SI(\Phi_2, \psi, \Sigma, DI, \mathcal{D}_0)$ holds, and let $\Delta \in \Sigma$ and $D$ be a derivation in $\mathcal{D}_0(\Delta)$. Since both $DI(\Phi_1, D)$ and $DI(\Phi_2, D)$ hold, also $DI(\Phi_1 \cup \Phi_2, D)$ holds. Because this holds for any $D \in \mathcal{D}_0(\Delta)$ and $\Delta \in \Sigma$, then $SI(\Phi_1 \cup \Phi_2, \psi, \Sigma, DI, \mathcal{D}_0)$ holds.

The proof for the second claim is similar. We simply consider the derivation $D$ for which $DI(\Phi_2, D)$ holds, and $DI(\Phi_1 \cup \Phi_2, D)$ will hold. The weak irrelevance claims for $g_1$ and $g_2$ in Example 2.2 present a counterexample of the third implication. ∎
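The counterexample can be replayed on a toy model. The sketch below assumes, as an illustration only, that a derivation is represented by the set of formulas it uses and that the query has exactly two derivations, one through `g1` and one through `g2`. The helper names `si` and `wi` are hypothetical.

```python
def si(subject, derivations):
    """Strong irrelevance: every derivation avoids every formula in the subject."""
    return all(subject.isdisjoint(d) for d in derivations)


def wi(subject, derivations):
    """Weak irrelevance: some derivation avoids the subject (vacuous if none exist)."""
    return not derivations or any(subject.isdisjoint(d) for d in derivations)


# Assumed toy setting: the query has one derivation using g1 and one using g2.
derivations = [frozenset({"g1"}), frozenset({"g2"})]

# Each weak claim holds separately, but their union does not.
wi_each = wi({"g1"}, derivations) and wi({"g2"}, derivations)
wi_union = wi({"g1", "g2"}, derivations)

# Strong claims are closed under union, and a strong plus a weak claim
# yields a weak claim for the union.
si_union = si({"g3"}, derivations) and si({"g4"}, derivations) and si({"g3", "g4"}, derivations)
mixed_union = wi({"g3", "g1"}, derivations)
```

Here `wi_each` is true while `wi_union` is false, which is the failure of the third implication, whereas `si_union` and `mixed_union` both come out true.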


Observation 2.9: Non-monotonicity: In order to exploit irrelevance claims, it is important to know whether their truth changes when the knowledge base changes. In general, adding new formulas to the knowledge base may cause a formula that was irrelevant to become relevant, or vice versa, a formula that was not irrelevant can become irrelevant. Weak irrelevance claims can change even when the added formulas are logical consequences of the knowledge base. In contrast, strong irrelevance claims are more robust. Definitions of strong irrelevance claims have the property that they do not change when the added knowledge is obtained by reasoning with the original knowledge base.⁷ Specifically, if $\Delta \vdash \tau$ and $\Delta$ is consistent, then

$$SI(\Phi, \psi, \Delta, DI_2, \mathcal{D}_0) \Rightarrow SI(\Phi, \psi, \Delta \cup \tau, DI_2, \mathcal{D}_0).$$

Hence a strongly irrelevant formula cannot become relevant by reasoning on existing knowledge.⁸

Proof: Suppose that $SI(\Phi, \psi, \Delta, DI_2, \mathcal{D}_0)$ holds and suppose in contradiction that $D$ is a derivation of $\psi$ from $\Delta \cup \tau$ such that $\phi \in D$ and $\phi \in \Phi$. We create a derivation $D'$ of $\psi$ from $\Delta$ such that $\phi \in D'$. The only modification to $D$ is to replace every appearance of $\tau$ as a leaf in the proof tree by the derivation of $\tau$ from $\Delta$. The result is a derivation of $\psi$ from $\Delta$ that includes $\phi$. Consequently, $SI(\Phi, \psi, \Delta, DI_2, \mathcal{D}_0)$ does not hold. ∎

As stated in our original motivations, for most definitions of irrelevance (and in particular all the definitions we consider here), if $\phi$ is irrelevant to $\psi$, it can be safely removed from the knowledge base:

Observation 2.10: For any definition of irrelevance that uses a definition of derivation irrelevance $DI$ such that $DI(\phi, D) \Rightarrow DI_1(\phi, D)$, then $\Delta \models \psi$ holds if and only if $(\Delta - \phi) \models \psi$.

Proof: Suppose $WI(\phi, \psi, \Delta, DI, \mathcal{D}_0)$ holds. This means that if $\Delta \vdash \psi$, then $\mathcal{D}_0(\Delta)$ is not empty, and $\psi$ has some derivation $D$ for which $DI(\phi, D)$ holds. The formula $\phi$ is not a member of $Base(D)$ because $DI_1(\phi, D)$ also holds. Therefore, the derivation $D$ will also be a valid derivation from $\Delta - \phi$. ∎

⁷ This property also seems natural for our common sense notion of irrelevance.

⁸ Assuming our inference is monotonic.

The utility of removing an irrelevant formula is a more subtle issue. Removing a formula that is only weakly irrelevant may not speed inference. In fact, explanation-based learning systems [Minton et al., 1989] do exactly the opposite: they add redundant rules (which, in our framework, would be considered weakly irrelevant). The utility of adding such rules is a subject of ongoing research (e.g., [Minton, 1988; Etzioni, 1990; Greiner and Jurisica, 1992; Etzioni and Minton, 1992]).

For strong irrelevance, savings are guaranteed for many cases. For example, when considering all derivations of the query (i.e., $\mathcal{D}_0 = \mathcal{D}_\psi$), if $SI(\Phi, \psi, \Delta, \ldots)$ holds, then deriving $\psi$ from $\Delta - \Phi$ costs no more than deriving it from $\Delta$. This property also holds if we consider a set of derivations $\mathcal{D}_0(\Delta)$ such that the inference engine is always guaranteed to find one of the derivations in $\mathcal{D}_0(\Delta)$ before it finds others. Removing strongly irrelevant formulas yields savings from several sources:

• Removing irrelevant formulas prunes whole branches of the search space.

• Much of the cost of reasoning in a large knowledge base is in doing database lookups. Removing a large number of irrelevant ground facts at the outset will significantly reduce the cost of each lookup operation.

• If updates are made to the KB that concern only irrelevant formulas, then we need not recompute the answer to the query.

• Space savings are achieved from removing the irrelevant formulas.

2.3.4 Examples of Definitions 

In this section we describe several instances of definitions in the space. We begin by 
showing how definitions in previous work can be couched in the space. 

Other Definitions From the Literature 

Subramanian investigates several definitions of irrelevance, which are all instances of weak irrelevance in our framework. The main definition investigated in [Subramanian, 1989] is the following:

Definition 2.11: Let $\phi$ be a formula, $\psi$ be a query and $\Delta$ be a knowledge base. The formula $\phi$ is said to be irrelevant to $\psi$, denoted by $WI_1(\phi, \psi, \Delta)$, if there exists a subset $\Delta_1$ of $\Delta$ such that $\Delta_1 \not\models \phi$ and $\Delta_1 \models \psi$. ∎

This definition can be couched in our framework as follows:

Observation 2.12: For a complete set of inference rules,

$$WI_1(\phi, \psi, \Delta) \equiv WI(\phi, \psi, \Delta, DI_3, \mathcal{D}_\psi).$$

Proof: Suppose $WI_1(\phi, \psi, \Delta)$ holds. Therefore, there is some subset $\Delta_1$ of $\Delta$ such that $\Delta_1 \models \psi$ and $\Delta_1 \not\models \phi$, and there is some derivation $D$ of $\psi$ from $\Delta_1$. Clearly, $Base(D) \not\models \phi$, and consequently, $WI(\phi, \psi, \Delta, DI_3, \mathcal{D}_\psi)$ holds.

Conversely, assume $WI(\phi, \psi, \Delta, DI_3, \mathcal{D}_\psi)$ holds. Consequently, there is some derivation $D$ of $\psi$ from $\Delta$ such that $Base(D) \not\models \phi$. The KB consisting of $Base(D)$ is a subset of $\Delta$ and does not entail $\phi$. Consequently, $WI_1(\phi, \psi, \Delta)$ holds. ∎

A variation of this definition that is described in [Subramanian and Genesereth, 1987] can be formulated as $WI(\phi, \psi, \Delta, DI_1, \mathcal{D}_\psi)$. Couching Subramanian's definitions in our framework highlights some of the properties of her definitions, mainly the fact that removing irrelevant formulas may not always lead to speeding up inference.

A definition of irrelevance is described in [Srivastava and Ramakrishnan, 1992]. Their definition is equivalent to strong irrelevance when $DI_2$ is quantified over the set of all derivations of the query, i.e., it is equivalent to $SI(\phi, \psi, \Delta, DI_2, \mathcal{D}_\psi)$.

Several resolution strategies are based on removing irrelevant clauses. For example, for refutation resolution, clauses containing pure literals⁹ can be shown to be strongly irrelevant (with respect to $DI_1$ and $\mathcal{D}_\psi$), and can therefore be removed. Tautologies can be shown to be weakly irrelevant (with respect to $DI_1$ and $\mathcal{D}_\psi$) and therefore are removed by the tautology elimination strategy [Genesereth and Nilsson, 1987].

The question of detecting when a query is independent of an update is closely related to the notion of irrelevance. In Chapter 5, we show that definitions of independence investigated by Elkan [Elkan, 1990] and Blakeley et al. [Blakeley et al., 1989] are equivalent to weak irrelevance (specifically, $WI(\phi, \psi, \Delta, \ldots)$). This observation enabled us to develop new algorithms for detecting independence.

Irrelevance with Minimal Derivations 

Interesting definitions of irrelevance are obtained by considering cases in which $DI$ is required to hold only for minimal derivations, i.e., where the choice $\mathcal{D}_0$ along the second axis is the set of minimal derivations. There are many ways of defining minimality of derivations. Here, we consider three possible definitions. Recall that a derivation is a sequence $\alpha_1, \ldots, \alpha_n$, and it can be viewed as a tree formed by the subgoal relation. The following are three possible definitions of minimality.

M1: A derivation $D$ is minimal if it does not have two identical formulas, $\alpha_i$ and $\alpha_j$, such that $\alpha_i$ is an ancestor of $\alpha_j$.

M2: A derivation $D$ is minimal if whenever $\alpha_1$ and $\alpha_2$ are two identical nodes in the tree, their subtrees are identical (essentially this means that if a formula is used in two places in the proof, then its derivation in both places is identical).

M3: A derivation $D$ is minimal if there is no other derivation $D'$ of the query such that $Base(D') \subseteq Base(D)$ and $Base(D') \neq Base(D)$.

⁹ A literal is pure if and only if it has no instance that is complementary to an instance of another literal in the knowledge base [Genesereth and Nilsson, 1987].

To see the difference between the classes of derivations, consider Example 2.2 and assume we also had a rule

$r_5$: $\mathit{canTA}(X, Y) \Rightarrow \mathit{pass}(X, Y)$.

Figure 2.2 shows three derivations. Derivation (a) is not a member of M1 because $\mathit{pass}(\mathit{Fred}, 202)$ is a subgoal of the query. Derivation (b) is a member of M1 but is not a member of M2 because $\mathit{pass}(\mathit{Fred}, 202)$ is derived in two different ways. Finally, derivation (c) is a member of M1 and M2, but not a member of M3 because the query can be derived using a subset of the base of the derivation (using only $\mathit{passExam}(\mathit{Fred}, 202)$ and the rules).
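The three minimality conditions can be checked mechanically on small derivation trees. The following is a minimal sketch under stated assumptions: a derivation tree is encoded as a nested tuple `(formula, children)`, the set of bases of all derivations of the query is available explicitly for the M3 test, and the example trees are hypothetical (they are not the derivations of Figure 2.2). All function names are illustrative.

```python
def node(formula, *children):
    """Build a derivation-tree node; leaves have no children."""
    return (formula, children)


def is_m1(tree, ancestors=frozenset()):
    """M1: no formula has an identical formula among its ancestors."""
    formula, children = tree
    if formula in ancestors:
        return False
    below = ancestors | {formula}
    return all(is_m1(child, below) for child in children)


def is_m2(tree):
    """M2: identical nodes always have identical subtrees."""
    first_seen = {}
    stack = [tree]
    while stack:
        current = stack.pop()
        formula, children = current
        if first_seen.setdefault(formula, current) != current:
            return False
        stack.extend(children)
    return True


def is_m3(base, all_bases):
    """M3: no other derivation of the query has a strictly smaller base."""
    return not any(other < base for other in all_bases)


# Hypothetical trees: q is re-derived below itself; p is derived in two ways;
# p is derived the same way in both places.
loop = node("q", node("p", node("q", node("f"))))
two_ways = node("q", node("p", node("a")), node("s", node("p", node("b"))))
shared = node("q", node("p", node("a")), node("s", node("p", node("a"))))
```

For instance, `two_ways` passes the M1 test but fails M2, while `shared` passes both. Whether it also satisfies M3 depends on the other derivations of the query, which is why `is_m3` takes their bases as an explicit argument.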

Note that M1 $\supseteq$ M2 but M1 $\not\supseteq$ M3. Interestingly, the definitions of strong irrelevance for M1 and M2 turn out to be equivalent:

Lemma 2.13: The definitions of strong irrelevance are equivalent for M1 and M2, i.e.,

$$SI(\phi, \psi, \Sigma, DI_2, M1) \equiv SI(\phi, \psi, \Sigma, DI_2, M2).$$

Strong irrelevance for M1 and M2 entails strong irrelevance for M3, i.e.,

$$SI(\phi, \psi, \Sigma, DI_1, M2) \Rightarrow SI(\phi, \psi, \Sigma, DI_1, M3).$$

Proof: Since M1 $\supseteq$ M2, it follows from Lemma 2.7 that

$$SI(\phi, \psi, \Sigma, DI_2, M1) \Rightarrow SI(\phi, \psi, \Sigma, DI_2, M2).$$

To show the converse, we show that if $D$ is a derivation of $\psi$ from a knowledge base $\Delta \in \Sigma$ such that $\phi \in D$ and $D \in$ M1, then there is a derivation $D'$ (of $\psi$ from $\Delta$) such that $\phi \in D'$ and $D' \in$ M2. Consequently,

$$\neg SI(\phi, \psi, \Sigma, DI_2, M1) \Rightarrow \neg SI(\phi, \psi, \Sigma, DI_2, M2)$$

and therefore

$$SI(\phi, \psi, \Sigma, DI_2, M2) \Rightarrow SI(\phi, \psi, \Sigma, DI_2, M1).$$

Let $D$ be a derivation such that $\phi \in D$ and $D \in$ M1. Suppose $\tau$ is a formula that appears twice (or more) in $D$ with non-identical subtrees, $T_1$ and $T_2$. Let $T$ be one of these subtrees in which $\phi$ appears (if $\phi$ does not appear in either $T_1$ or $T_2$ then choose one arbitrarily). Replace all the subtrees of $\tau$ in $D$ by $T$. Note that since $D \in$ M1, this transformation is well defined. Denote the resulting derivation by $D'$. The derivation $D'$ is a valid derivation of the query, it includes $\phi$ and furthermore, $\tau$ has a unique subtree in every appearance. We repeat this transformation until we cannot find a formula $\tau$ which appears with two (or more) non-identical derivation subtrees. The resulting derivation will be a member of M2 and will include $\phi$. Note that the number of transformations must be finite, because the number of transformations we perform is at most the number of distinct formulas in $D$.

To show that strong irrelevance for M1 (and therefore M2) entails strong irrelevance for M3, we show the following. Let $D$ be a derivation such that $\phi \in Base(D)$ and $D \in$ M3. We construct a derivation $D'$ such that $\phi \in Base(D')$ and $D' \in$ M1 as follows. For every pair of identical nodes $\tau_1, \tau_2 \in D$ such that $\tau_1$ is an ancestor of $\tau_2$, we replace the subtree of $\tau_1$ by the subtree of $\tau_2$.¹⁰ The resulting derivation $D'$ is a member of M1. Moreover, if $\phi \in Base(D)$ then $\phi \in Base(D')$, otherwise $D$ would not be a member of M3. As before, this construction shows that

$$\neg SI(\phi, \psi, \Sigma, DI_1, M3) \Rightarrow \neg SI(\phi, \psi, \Sigma, DI_1, M1)$$

and therefore

$$SI(\phi, \psi, \Sigma, DI_1, M1) \Rightarrow SI(\phi, \psi, \Sigma, DI_1, M3).$$

∎

Figure 2.2: Minimal derivations

It should be noted that removing formulas that do not appear in minimal derivations (of the type M1) will speed up inference for many search strategies employed by inference engines. For example, an inference mechanism performing depth-first search or breadth-first search will always find a derivation of the query that belongs to M1 before it finds one that does not.

In Chapter 3, we describe an algorithm for automatically deciding which formulas are strongly irrelevant to a query when considering M1 (and therefore also M2) for Horn rule theories. We also show that deciding which formulas are strongly irrelevant for M3 is undecidable in general. However, Lemma 2.13 implies that the algorithms of Chapter 3 provide a sufficient condition for strong irrelevance for M3.


Relationship to Truth Maintenance Systems 

Strong irrelevance for M3 can be characterized in terms of labels in an assumption 
based truth maintenance system (ATMS) [de Kleer, 1986]: 


Observation 2.14: Assume a complete set of inference rules and let $\phi$ be a formula and $\psi$ be a query. $SI(\phi, \psi, \Delta, DI_1, M3)$ holds if and only if $\phi$ does not appear in any ATMS label of $\psi$.

Proof: An ATMS label of $\psi$ is a set of support $S$ such that $S \models \psi$ and such that there does not exist a subset $S' \subset S$ such that $S' \models \psi$. Clearly, $SI(\phi, \psi, \Delta, DI_1, M3)$ does not hold if and only if there is some derivation $D$ such that $\phi \in Base(D)$ and $Base(D)$ is a minimal support set for $\psi$. The set $Base(D)$ will be the ATMS label for $\psi$. ∎

¹⁰ If there are several such pairs, we do so in an arbitrary order.

This observation shows that even though finding all formulas that cannot appear in the ATMS labels of a query is undecidable, the algorithms that we present for deciding strong irrelevance can be used to prune formulas from consideration when computing ATMS labels.
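For a single, fixed knowledge base in which the bases of the derivations of a query can be enumerated, the correspondence with labels can be sketched as follows. This is an illustration under that assumption only (it does not address the undecidable problem over all databases), and the names `minimal_supports` and `strongly_irrelevant_m3` are hypothetical.

```python
def minimal_supports(bases):
    """Keep only the support sets that have no strict subset among the candidates."""
    unique = set(bases)
    return {b for b in unique if not any(other < b for other in unique)}


def strongly_irrelevant_m3(formula, bases):
    """A formula is strongly irrelevant under M3 if no minimal support set contains it."""
    return all(formula not in support for support in minimal_supports(bases))


# Assumed toy input: the bases of three derivations of the same query.
bases = [frozenset({"a"}), frozenset({"a", "b"}), frozenset({"c", "d"})]
labels = minimal_supports(bases)
```

In this toy input `b` occurs only in a non-minimal support set, so it is strongly irrelevant under M3 and could be dropped before any label computation, whereas `a`, `c` and `d` each occur in some label.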

Evidence Based Definitions Revisited 

As described in the beginning of this chapter, much of the previous analysis of ir- 
relevance was done in a slightly different context. Specifically, it has addressed the 
following question: 

• Given a set of evidence $\mathcal{E}$ and query $q$, which formulas are irrelevant to $q$ with respect to the evidence?

As stated, our analysis differs in that we want to know which part of our knowledge base (that can be viewed as evidence) is irrelevant to the query. To reconcile the two, we can ask the following question:

• Given a knowledge base $\Delta$ and a subset of it $\mathcal{E}$ called the evidence, which parts of $\Delta$ are irrelevant to $q$, given the evidence?

Intuitively, the formulas in $\mathcal{E}$ are basic assumptions about the domain that we want to use if possible. We can define such a notion in our framework in several ways. One way is to limit the set of derivations considered to those in which formulas in $\mathcal{E}$, if they appear in a derivation $D$, must be in $Base(D)$. This means that we do not allow evidence to be derived from other formulas. A slightly stronger way of formalizing this notion is by considering derivations of the query that have minimal support with respect to the evidence $\mathcal{E}$, denoted by $\mathcal{D}_{\mathcal{E}}$, as follows:

Definition 2.15: A derivation $D$ is said to have minimal support w.r.t. the evidence $\mathcal{E}$ if there does not exist a strict subset $S \subset Base(D)$ such that $S \cup \mathcal{E} \vdash \psi$. ∎

Intuitively, derivations in $\mathcal{D}_{\mathcal{E}}$ use the formulas in the evidence as much as possible.

Returning to our example, if the query is $q = \mathit{canTA}(\mathit{Fred}, 101)$ and the set of evidence is empty, then the atom $\mathit{passExam}(\mathit{Fred}, 101)$ is not strongly irrelevant to the query. However, if our evidence includes the atom $\mathit{pass}(\mathit{Fred}, 101)$, then $\mathit{passExam}(\mathit{Fred}, 101)$ becomes strongly irrelevant to $q$.

2.4 Automatically Deriving Irrelevance Claims

A key question that we address in this dissertation is how (and whether) irrelevance claims can be derived automatically. Specifically, we are interested in two problems. First, given a knowledge base, a query and a specific definition of irrelevance, we want to find automatically which formulas in the knowledge base are irrelevant to the query. Second, given an irrelevance claim, we want to derive other irrelevance claims that logically follow. We focus on solving the first problem. In Chapter 4 we show how results pertaining to the first problem can be used to solve the second.

In general, deciding which formulas are irrelevant to a given query will be more 
expensive than answering the query itself, especially in large knowledge bases. Fur- 
thermore, if the knowledge base changes, the relevance reasoning needs to be repeated. 
In order for our algorithms to be of practical interest, we will derive irrelevance claims 
by examining only a small and stable part of the KB, and will derive irrelevance claims 
that hold independent of any changes that are made to other unexamined parts. Furthermore, our irrelevance claims will hold for a family of queries (given by a query schema).

We examine the question of automatically deriving irrelevance claims for Horn knowledge bases that consist of a set of Horn rules $\mathcal{P}$ and a set of ground atomic facts $G$. We distinguish two sets of predicates in the KB: base predicates (often called EDB predicates) and derived predicates (IDB predicates). The base predicates are those that appear in the ground facts of $G$. The derived predicates are those that appear in the consequents of the rules. For syntactic convenience, we assume that base predicates do not appear in the consequents of rules. The KB consisting of a set of rules $\mathcal{P}$ and ground facts $G$ can also be viewed as defining relations for the derived predicates in terms of the base predicates.
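To make concrete the view that the rules define the derived relations in terms of the base relations, the following sketch computes them by naive bottom-up evaluation. It is an illustration only, not an algorithm from this dissertation: atoms are encoded as tuples, variables are strings starting with `?`, and the two rules are hypothetical ones loosely modeled on the running example.

```python
def match_fact(atom, fact, binding):
    """Extend binding so that atom equals the ground fact, or return None."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    extended = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if term.startswith("?"):                      # variable
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:                           # constant mismatch
            return None
    return extended


def evaluate(rules, facts):
    """Naive bottom-up fixpoint: apply every rule until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            bindings = [{}]
            for atom in body:
                next_bindings = []
                for binding in bindings:
                    for fact in known:
                        extended = match_fact(atom, fact, binding)
                        if extended is not None:
                            next_bindings.append(extended)
                bindings = next_bindings
            for binding in bindings:
                new_fact = (head[0],) + tuple(binding.get(t, t) for t in head[1:])
                if new_fact not in known:
                    known.add(new_fact)
                    changed = True
    return known


# Hypothetical rules (head, body) and ground facts G over the base predicate passExam.
rules = [
    (("pass", "?x", "?c"), [("passExam", "?x", "?c")]),
    (("canTA", "?x", "?c"), [("pass", "?x", "?c")]),
]
ground_facts = {("passExam", "Fred", "101")}
derived = evaluate(rules, ground_facts)
```

With these inputs the derived relation for `canTA` contains the tuple for Fred and course 101 and nothing else. The evaluation touches every ground fact, which is precisely the cost that relevance reasoning aims to reduce.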

Many of the interactions between rules in a knowledge base can be deduced by considering the semantics of some of the predicates that appear in them, such as order predicates ($=$, $\leq$, $<$, $>$, $\geq$) or sort predicates. For instance, in Example 2.2, $g_3$ and $g_4$ were deemed strongly irrelevant by considering the semantics of the predicate $<$. We therefore distinguish a subset of the predicates which we name constraint predicates (or interpreted predicates). These predicates will be treated much like the EDB predicates, with the difference being that their semantics will be enforced in our relevance reasoning. A constraint formula is a formula in some language $\mathcal{L}$ for expressing constraints that involves only literals of constraint predicates and logical connectives (e.g., disjunction, conjunction, negation). For example, the formula $\mathit{Even}(x) \wedge (x > 100)$ is a constraint if the predicate $\mathit{Even}$ is a sort predicate. We place few restrictions on the properties that the constraint predicates need to satisfy. Formally, a formula $f$ (in the language $\mathcal{L}$), with free variables $X_1, \ldots, X_n$, describes a (possibly infinite) relation $R_f(X_1, \ldots, X_n)$, which is the set of all tuples satisfying the constraints expressed by $f$. We assume the following properties of constraint formulas:

Closure: Given formulas $f_1$ and $f_2$, it is possible to effectively construct formulas that express:

• The join of $R_{f_1}$ and $R_{f_2}$.

• A projection of $R_{f_1}$ (i.e., a relation consisting of only a subset of the arguments of $R_{f_1}$).

• A selection $\sigma_{i=j} R_{f_1}$, where $i$ and $j$ are some columns of $R_{f_1}$ (i.e., a relation consisting of only tuples in which columns $i$ and $j$ are equal).

• A selection $\sigma_{i=c} R_{f_1}$, where $i$ is some column of $R_{f_1}$ and $c$ is a constant in the language $\mathcal{L}$.

Equivalence: Given formulas $f_1$ and $f_2$, it is decidable whether $R_{f_1} = R_{f_2}$.

Satisfiability: Given a formula $f$, it is decidable whether $R_f$ is nonempty.¹¹

Finiteness: Let $C$ be a finite set of constants in the language $\mathcal{L}$, and let $\mathcal{F}$ be a finite set of formulas in the language $\mathcal{L}$ that have at most $n$ free variables (for some fixed $n$) and only constants from $C$. Then, applications of the operators (discussed in the Closure property) to formulas in $\mathcal{F}$ may create only a finite number of nonequivalent formulas over $n$ (or fewer) free variables.

Moreover, if $f$ is a formula with a free variable $X$, then $f$ can imply $X = c$, where $c$ is a constant of the language $\mathcal{L}$, only if $c$ appears in $f$.

The Closure condition guarantees that we can perform the basic manipulations of the constraints within our constraint language. The second and third conditions guarantee that we can identify two equivalent constraints. The Finiteness condition guarantees that we only have a finite number of non-isomorphic constraints. In Chapter 3, we discuss the case in which the Finiteness condition does not hold. The procedures needed to compute the closure operations, equivalence and satisfiability are assumed to be given.¹²

¹¹ Note that if we have a formula FALSE in our language, denoting the empty relation, then the Satisfiability property will follow from the Equivalence property.

¹² Typically, these procedures are efficient. For example, for order constraints, testing equivalence is cubic in the number of variables.

These conditions cover a wide class of interpreted constraints. The following are a few examples:

• Order constraints: The language consisting of the predicates $=$, $\leq$, $<$, $>$, $\geq$ and the connectives $\wedge$ and $\vee$. If we allow only conjunctions, the Closure condition will not be satisfied. This special case is treated in Section 3.2.1.

• Sort constraints: A constraint language based on a finite sort hierarchy, and the connectives $\wedge$, $\vee$ and $\neg$.

• Finite, given relation: Often, a given relation that is relatively small and stable can be best viewed as a constraint. Any given finite relation satisfies the properties that we require.
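For the last case, a finite given relation, the required operations are easy to realize directly. The sketch below assumes a relation is a Python set of equal-length tuples and builds a join out of a product followed by a selection and a projection. The function names and the `prereq` relation are hypothetical.

```python
def product(r1, r2):
    """All concatenations of a tuple of r1 with a tuple of r2."""
    return {a + b for a in r1 for b in r2}


def project(r, columns):
    """Keep only the listed columns (0-based), in the given order."""
    return {tuple(t[i] for i in columns) for t in r}


def select_eq(r, i, j):
    """Selection sigma_{i=j}: tuples whose columns i and j are equal."""
    return {t for t in r if t[i] == t[j]}


def select_const(r, i, c):
    """Selection sigma_{i=c}: tuples whose column i equals the constant c."""
    return {t for t in r if t[i] == c}


def equivalent(r1, r2):
    """Equivalence is plain set equality for finite relations."""
    return r1 == r2


def satisfiable(r):
    """Satisfiability is non-emptiness."""
    return bool(r)


# Hypothetical finite relation: course prerequisites.
prereq = {("101", "202"), ("202", "303")}

# Join prereq with itself on "second column = first column", then keep the ends.
two_steps = project(select_eq(product(prereq, prereq), 1, 2), (0, 3))
```

Because every operation returns another finite set of tuples over the same constants, only finitely many distinct relations of bounded arity can ever arise, which is the intuition behind the Finiteness property in this case.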

Hereafter, a constraint will refer to a constraint formula in some constraint language $\mathcal{L}$.

Finally, we also consider cases in which the rules contain negated literals in their antecedents (and are therefore no longer Horn). In such cases, we assume:

• The negation is stratified [Ullman, 1989].¹³

• The negation is safe, i.e., if a variable appears in a negative literal in the antecedent then it also appears in a positive literal in the antecedent.
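Both assumptions can be checked syntactically. The sketch below is a hedged illustration: a rule is encoded as a triple of head atom, positive body atoms and negated body atoms, variables are strings starting with `?`, and stratification is tested by looking for a dependency cycle that passes through a negative arc. The encoding and the function names are assumptions made for this example.

```python
def is_stratified(rules):
    """True if no cycle of the predicate dependency graph contains a negative arc."""
    edges = {}          # body predicate -> set of head predicates
    negative_arcs = []  # (body predicate, head predicate) pairs that arise from negation
    for head, positive, negated in rules:
        for atom in positive:
            edges.setdefault(atom[0], set()).add(head[0])
        for atom in negated:
            edges.setdefault(atom[0], set()).add(head[0])
            negative_arcs.append((atom[0], head[0]))

    def reaches(source, target):
        seen, stack = set(), [source]
        while stack:
            predicate = stack.pop()
            if predicate == target:
                return True
            if predicate not in seen:
                seen.add(predicate)
                stack.extend(edges.get(predicate, ()))
        return False

    # A negative arc p -> q lies on a cycle exactly when q can reach p.
    return not any(reaches(q, p) for p, q in negative_arcs)


def is_safe(rule):
    """True if every variable of a negated literal also occurs in a positive literal."""
    _, positive, negated = rule
    bound = {t for atom in positive for t in atom[1:] if t.startswith("?")}
    return all(t in bound for atom in negated for t in atom[1:] if t.startswith("?"))


ok_rule = (("q", "?x"), [("p1", "?x")], [("p2", "?x")])
cyclic_rule = (("win", "?x"), [("move", "?x", "?y")], [("win", "?y")])
unsafe_rule = (("q", "?x"), [("p1", "?x")], [("p2", "?y")])
```

For example, `ok_rule` is both stratified and safe, `cyclic_rule` is safe but not stratified because `win` depends negatively on itself, and `unsafe_rule` is stratified but not safe.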

We consider queries that are atoms which are either ground (i.e., is $p(a)$ entailed by $\mathcal{P} \cup G$?), or contain free variables (i.e., find some, or all, $X$ such that $p(X)$ is entailed by $\mathcal{P} \cup G$).

In many applications using Horn rule knowledge bases, it is the case that the bulk of the KB is ground facts, and the ground facts are much more prone to frequent changes than the remainder of the KB. Therefore, we address the irrelevance problem for the set of knowledge bases that differ only on ground facts. Specifically, we address the following question. Let $\mathcal{P}$ be a set of rules, and let $\Sigma_{\mathcal{P}}$ be the set of knowledge bases of the form $\mathcal{P} \cup G$, where $G$ is a set of ground atomic facts for the EDB predicates. The question then is whether we can decide whether a given fact $\phi$ is irrelevant to the query $\psi$, i.e., does $SI(\phi, \psi, \Sigma_{\mathcal{P}}, DI_2, \mathcal{D})$ or $WI(\phi, \psi, \Sigma_{\mathcal{P}}, DI_2, \mathcal{D})$ hold. Note that in Horn rule KBs, $DI_1$ and $DI_2$ are equivalent for the rules and the EDB formulas. For IDB formulas, $DI_1$ is trivially true. Therefore, we consider the definition $DI_2$ in our investigations.

A summary of the decidability results pertaining to this question is shown in Table 2.1. As we prove below, weak irrelevance is undecidable whenever the rules contain recursion. In contrast, strong irrelevance is efficiently decidable for a larger class of languages. In Chapter 3 we present algorithms for deriving irrelevance for several cases of strong irrelevance, including irrelevance under minimal derivations. Chapter 5 describes an algorithm for detecting weak irrelevance in the presence of constraints. In the next section we prove a few undecidable cases of irrelevance.

The complexity of the algorithms we describe is linear in the number of rules in the KB and does not depend on the number of ground facts. The complexity is exponential in the arity of predicates. When we consider irrelevance under minimal derivations, the algorithms are doubly exponential in the arity of the predicates. However, arities of predicates tend to be very small (e.g., frame systems usually employ mostly binary predicates). Furthermore, we believe that exponential running time is not likely to occur in practice (since finding examples with exponential running time requires careful crafting of the rules),¹⁴ and so the algorithms we present will be efficient in practice.

¹³ The rules are stratified if there are no dependency cycles that involve negations between the predicates in the KB. The dependency graph of the KB has one node for every predicate and there is an arc from $p$ to $q$ if $p$ appears in the antecedent of a rule whose consequent is $q$.


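Language | Strong irrelevance (all derivations, minimal derivations, minimal-support derivations) | Weak irrelevance
---------+-----------------------------------------------------------------------------------------+-----------------
[illegible] | Decidable; follows from [Kifer, 1988] | Decidable; follows from [Sagiv, 1988]
No recursion + constraints | Decidable; follows from Chapter 3 | Decidable; Chapter 5
Datalog | Decidable (Chapter 3); undecidable for minimal-support derivations (Lemma 2.17) | Undecidable (Lemma 2.16)
Datalog with constraints | Decidable (Chapter 3); undecidable for minimal-support derivations (Lemma 2.17) | Undecidable (Lemma 2.16)
[illegible] | Undecidable; follows from [Abiteboul and Hull, 1988] | [illegible]
Datalog with stratified negation | Undecidable (Lemma 2.18); undecidable for minimal-support derivations (Lemma 2.17) | Undecidable (Lemma 2.16)
Negated base predicates | Decidable (Section 3.4); undecidable for minimal-support derivations (Lemma 2.17) | Undecidable (Lemma 2.16)
Unary base predicates | Decidable [Levy et al., 1993] | [illegible]

Table 2.1: Decidability of deriving irrelevance claims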


2.4.1 A Few Undecidable Cases 

The following shows that weak irrelevance is undecidable even for function-free Horn 
rules (i.e., datalog): 

Lemma 2.16: Let $\mathcal{P}$ be a set of datalog rules and $\psi$ be a query. Determining whether $WI(\phi, \psi, \Sigma_{\mathcal{P}}, DI_2, \mathcal{D}_\psi)$ holds is undecidable even if the rules have no interpreted predicates.

Proof: Let $r \in \mathcal{P}$ and $\psi$ be a query. We prove the lemma by showing that the claim $WI(r, \psi, \Sigma_{\mathcal{P}}, DI_2, \mathcal{D}_\psi)$ holds if and only if $r$ is redundant, i.e., the set of rules $\mathcal{P} - r$ is equivalent to the set $\mathcal{P}$. In proof, suppose that $WI(r, \psi, \Sigma_{\mathcal{P}}, DI_2, \mathcal{D}_\psi)$ holds; then for any knowledge base $\Delta \in \Sigma_{\mathcal{P}}$, $WI(r, \psi, \Delta, DI_2, \mathcal{D}_\psi)$ holds. Therefore, if there is a derivation of $\psi$, then there is one that does not use $r$. Consequently, $r$ can be removed from $\mathcal{P}$ without changing the answer to $\psi$, regardless of the ground facts in the KB, and therefore $r$ is redundant. Conversely, if $r$ is redundant, that means that for every $\Delta \in \Sigma_{\mathcal{P}}$, if $\psi$ is provable, there is a derivation that does not contain $r$. Therefore, $WI(r, \psi, \Sigma_{\mathcal{P}}, DI_2, \mathcal{D}_\psi)$ holds.

However, it follows from [Shmueli, 1987] that redundancy is undecidable for datalog theories. Therefore, weak irrelevance is undecidable. ∎

¹⁴ Specifically, it requires rules that create in their consequents permutations of the variables from their antecedents.

It should be noted that Subramanian [Subramanian, 1989] states a similar result for $DI_3$, but does not give a proof. The following lemma shows that strong irrelevance under minimal-support derivations is also undecidable:

Lemma 2.17: Determining whether $SI(\phi, \psi, \Sigma_{\mathcal{P}}, DI_2, M3)$ holds is undecidable for datalog knowledge bases without interpreted predicates.

Proof: We prove the lemma by reducing the containment problem of datalog programs to the strong irrelevance problem for M3. Since it follows from [Shmueli, 1987] that containment is undecidable, strong irrelevance for M3 is also undecidable.

Let $P_1$ and $P_2$ be two datalog programs. Let $e$ be a new EDB predicate appearing nowhere in $P_1$ or $P_2$. We construct a program $P_3$ as follows:

$$p_1(X) \wedge e(X) \Rightarrow p_3(X)$$

$$p_2(X) \Rightarrow p_3(X)$$

We show that $SI(e(X), p_3(X), \Sigma_{P_3}, DI_2, M3)$ holds if and only if $P_1 \subseteq P_2$. Suppose $P_1 \subseteq P_2$ holds. We show that for any given constant $a$, $e(a)$ cannot be part of a minimal-support derivation of $p_3(a)$. Suppose $G$ is a database from which $e(a)$ is part of a minimal-support derivation $D$ of $p_3(a)$. We can assume that $G$ contains only the ground atoms in $Base(D)$. The database $G - e(a)$ is therefore enough to derive $p_1(a)$. However, since $P_1 \subseteq P_2$, the database $G - e(a)$ is also enough for deriving $p_2(a)$, and therefore $p_3(a)$. However, this would mean that $D$ is not a minimal-support derivation because the derivation of $p_3(a)$ through $p_2(a)$ uses a strict subset of $Base(D)$.

Conversely, suppose $P_1 \not\subseteq P_2$. Therefore, there is a database $G$ and a constant $a$ such that $p_1(a) \in P_1(G)$ and $p_2(a) \notin P_2(G)$. Consider the database $G \cup e(a)$, and let $D$ be a minimal-support derivation of $p_1(a)$. The formula $e(a)$ will now be part of a minimal-support derivation of $p_3(a)$, constructed from $D$ and $e(a)$, using the first rule. Consequently, $SI(e(X), p_3(X), \Sigma_{P_3}, DI_2, M3)$ does not hold. ∎

Finally, we -show* that strong irrelevance is undecidable when we allow the rules 
to have stratified negation. In our discussion, we aSsume-perfect model semantics of 
the rules (cf. [Ullman, 1989]). ! ® 

15 The perfect model of a set of rules is the one computed in a bottom-up fashion ,_stratum_by 
stratum. 
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Lemiiia 2.18: Let V be a set of datalog rules with stratified negation and r 6 V. De- 
termining whether $ I(r. t\ E-p, DI 2 , X\-) is undecidable, even, if "P ha$ no interpreted 
predicates. 

Proof: Testing equivalence of two datalog programs is undecidable [Shmueli, 1987].
We will reduce the equivalence problem to the irrelevance problem of a rule in a
stratified program, i.e., we show that if there is an algorithm for deciding whether a
rule $r$ in a datalog knowledge base $\mathcal{P}$ is strongly irrelevant, then we can design an
algorithm for testing equivalence of two programs.

Let $\mathcal{P}_1$ and $\mathcal{P}_2$ be two programs with query predicates $p_1$ and $p_2$.$^{16}$ Without
loss of generality we can assume that $\mathcal{P}_1$ and $\mathcal{P}_2$ have distinct sets of IDB
predicates. To test equivalence, it is enough to test whether $\mathcal{P}_1 \supseteq \mathcal{P}_2$ and
$\mathcal{P}_2 \supseteq \mathcal{P}_1$. Let $Q$ be the program containing the rules of $\mathcal{P}_1$ and $\mathcal{P}_2$ and
the rule

    $r:\ p_1(X) \wedge \neg p_2(X) \Rightarrow q(X)$,

where $q$ is the query predicate of $Q$ and it appears nowhere in $\mathcal{P}_1$ or $\mathcal{P}_2$. Note
that $Q$ is a stratified program, since $r$ is the only rule containing negation.
Clearly, $r$ is strongly irrelevant to $q$ if and only if $\mathcal{P}_2 \supseteq \mathcal{P}_1$, since $r$ will
be used in a derivation if and only if there is some database in which some ground
tuple is a member of $p_1$ and not of $p_2$. In a similar fashion, we can create a
program with a rule $r'$ which will be strongly irrelevant if and only if
$\mathcal{P}_1 \supseteq \mathcal{P}_2$. Consequently, if rule irrelevance is decidable for programs with
stratified negation, then program equivalence will be decidable. $\Box$

Chapter 3 and [Levy et al., 1993] describe restrictions on stratified negation
under which strong irrelevance is still decidable.


2.5 Summary and Related Work 

We have presented a general framework for analyzing and comparing definitions of
irrelevance. The framework is based on a proof-theoretic analysis of the notion of
irrelevance, and therefore enables us to address the two key issues in relevance
reasoning: automatically deriving irrelevance claims and the utility of removing
irrelevant formulas. Aside from suggesting new definitions of irrelevance, the framework
encompasses previous definitions that were discussed in the literature. For example,
as will be discussed in Chapter 0 , the framework sheds new light on the problem
of detecting independence of queries from updates. Within the framework, we have
identified a class of irrelevance claims, namely strong irrelevance, which have several

16 Recall that the programs $\mathcal{P}_1$ and $\mathcal{P}_2$ are equivalent if for any database, the relation
computed for $p_1$ is the same as that computed for $p_2$.
desirable properties. First, removing strongly irrelevant formulas is guaranteed never
to slow inference (and usually speeds it up significantly). In Chapter 4 we will present
experimental results to validate the impact of these speedups. Second, we demonstrate
in Chapter 3 that for some languages, it is possible to efficiently decide which
formulas are strongly irrelevant to a given query. Finally, strong irrelevance satisfies
several properties that have been argued to be natural for the common sense notion
of irrelevance (such as closure under union and some forms of monotonicity).

The notion of irrelevance has been formally investigated in the philosophy
literature [Keynes, 1921; Carnap, 1950; Gardenfors, 1978]. As stated earlier, the focus of
the discussion there was on formalizing a notion of irrelevance that would fit common
sense notions of the word. The discussion did not concern itself with the computational
aspects of reasoning about irrelevance, on which we focus here. Moreover, the focus
of the discussion in that literature is on analyzing irrelevance w.r.t. a set of evidence,
which is usually treated as a closed theory (i.e., independent of changes in the form
of the formulas or the inference mechanism). In our analysis, we are concerned with
finding irrelevant formulas in a large KB, where the form of the KB and the inference
mechanism play key roles.

A related concept discussed in the formal logic community is that of relevance logics
(e.g., [Anderson and Belnap, 1975; Dunn, 1986; Avron, 1992]). The key idea in
relevance logics is to modify the logic and the inference rules such that only relevant
implications can be made. However, two issues are still largely open in this field.
The first is devising clean and intuitive semantics for these logics, and the second
is providing tractable inference for them. In contrast, our analysis of irrelevance
assumes that the underlying logic remains unchanged.

Within AI, the notion of irrelevance was used rather informally in various works, 
such as RLL [Greiner, 1980] and compositional modeling [Falkenhainer and Forbus, 
1991]. Irrelevance was investigated extensively in the context of probabilistic reason- 
ing [Pearl, 1988]. However, in that context, irrelevance has a natural definition based 
on the notion of conditional independence. This notion does not carry over to the
context of logical knowledge bases. 

The work most related to ours is the analysis of irrelevance given by
Subramanian [Subramanian and Genesereth, 1987; Subramanian, 1989]. Subramanian's
motivations for analyzing irrelevance are similar to ours, namely, reformulating the
knowledge base to create one that is simpler and will therefore lead to more efficient
inference. However, her framework does not provide sufficient distinctions to enable
one to analyze the issues of deriving irrelevance claims and the utility of doing so.
Our framework can be viewed as a refinement of hers, where in addition to
considering the form of the KB, we also consider the possible derivations that an inference
mechanism can pursue. The specific definitions that she considers are formulated in
our framework as variations of weak irrelevance. Subramanian also defined a class of
computational-irrelevance claims whose exploitation leads to computational savings,
but only gave some straightforward examples of such claims. Our class of strong
irrelevance claims is a prime example of computational-irrelevance claims.

It should be noted that in [Subramanian and Genesereth, 1987], a definition
of strong-irrelevance is given. However, instances satisfying this definition are not
necessarily instances of computational irrelevance. For instance, under her
definition, in Example 2.2, the atoms g\-g* are also strongly irrelevant to the query
ciifiTA{ Fred, 101). Finally, Subramanian discusses several algorithms for detecting
irrelevance. However, they focus on the case of propositional logic KBs and require
solving the query as part of the algorithm. Consequently, their utility is questionable.
She considers an extension of the algorithm to the first order case, using the concept of
a definability graph. This graph denotes only the dependencies between predicates in
the KB, and therefore does not enable relevance reasoning beyond simple reachability
tests.


Chapter 3 

The Query-tree 


In the previous chapter we posed the problem of automatically deriving irrelevance
claims. This chapter describes algorithms for automatically deriving strong
irrelevance claims. Recall that a formula is strongly irrelevant to a query if some condition
($DI$) holds for all the derivations in some set $\mathcal{D}_0$ of derivations of the query.
Therefore, in order to deem a formula strongly irrelevant, we need to meet two challenges.
The first is to establish properties of a possibly infinite set of derivations by a finite
procedure. The second is that even if there is only a finite number of derivations, an
algorithm that actually enumerates all of them will be of little interest, both
theoretically and in practice. Therefore, we would like an efficient method of establishing
properties of a set of derivations without actually enumerating them.

This chapter describes a novel tool, the query-tree (see the example in Figure 3.2),
that is used to establish efficiently the properties of a set of derivations. The
query-tree is a data structure that encodes a (possibly infinite) set of derivations so that
properties of that set can be established by inspecting the tree. For example, by
inspecting the query-tree we can check whether a certain formula can be part of some
derivation of the query, and therefore decide whether it is strongly irrelevant to the
query. The query-tree is a general method for encoding a given set of derivations.
Query-trees differ depending on which set of derivations we want the tree to encode.
The challenge in building a query-tree is to ensure that it encodes all and only the
derivations in which we are interested. When it does, inspecting the query-tree is
akin to inspecting the entire set of derivations.

We begin in Section 3.1 by describing the principles underlying the query-tree
method. We present a general method for building a query-tree that encodes a
desired set of derivations. In the subsequent sections we describe several instances of
the query-tree, obtained by following the general method. Section 3.2 considers the
problem of building a query-tree for function-free Horn-rule knowledge bases with
interpreted predicates (in this chapter we assume that the interpreted predicates satisfy
the conditions given in Section 2.4). We show how to build a query-tree that encodes
precisely the set of possible derivations of the query. We also discuss extending the
algorithm to the case where rules may have function symbols. In Section 3.3 we
describe how to build a query-tree that encodes only the set of minimal derivations of
the query. Section 3.4 considers an extension beyond Horn rule knowledge bases. We
show how to build a query-tree that encodes precisely the set of derivations of the
query when the rules have negated EDB literals in the antecedent.

3.1 The Query-tree Method 

3.1.1 Symbolic Derivations 

In the context of Horn rule knowledge bases, we view a derivation as a tree consisting
of goal-nodes and rule-nodes (see Figure 3.1(a)). The root of the tree is a goal-node
containing the query atom. If a goal-node $g$ was derived using a rule $r$ and the
antecedents $g_1, \ldots, g_m$, then $r$ is the child of $g$ and its children are $g_1, \ldots, g_m$. The
leaves of a derivation are ground atomic facts from the database.

Since the query-tree will be built based only on the rules in the knowledge base
(without looking at the ground atomic formulas), it will encode a set of derivations
by encoding a set of symbolic derivations (see Figure 3.1(b)). Like a derivation,
the root of a symbolic derivation tree is a goal-node of the query atom (which does
not have to be ground). The child of a goal-node $g$ is a rule-node containing a rule
whose consequent unifies with the goal-node. The rule-node has a goal-node child
for every conjunct in its antecedent, and the contents of each such goal-node is the
corresponding conjunct in the unification of the rule with $g$. The leaves of a symbolic
derivation tree contain atoms of EDB predicates or atoms of interpreted predicates.

A symbolic derivation tree contains only variables and constants that appear in the
rules. If a rule-node $r'$ in a symbolic derivation tree contains the rule $r$ from the
knowledge base, we say that $r'$ is a rule of $r$. Similarly, if $g$ is a goal-node containing
an atom of the predicate $p$, we say that $g$ is a node of $p$. We assume that the variable
patterns in a symbolic derivation tree implicitly represent all the equalities implied
by the interpreted constraints, i.e., if the conjunction of the interpreted constraints in
the rules implies that two variables $X$ and $Y$ must be equal,$^{1}$ then the same variable
appears in all the positions of $X$ and $Y$.

A symbolic derivation represents the set of derivations that can be obtained by
assigning constants to the variables in the derivation. Therefore, a set of symbolic
derivations represents the union of derivations represented by each element of the set.
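
As a rough illustration of the relation between symbolic and ground derivations, the following Python sketch represents a derivation as nested goal-nodes and rule-nodes and instantiates a symbolic derivation by assigning constants to its variables. The names `Goal`, `Rule`, `ground` and `leaves` are hypothetical, introduced here only for illustration, and interpreted constraints are ignored; the example uses the rule r4 (step(X, Y) => link(X, Y)) from the knowledge base of Figure 3.2.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Goal:
    pred: str
    args: tuple
    rule: Optional["Rule"] = None   # None for a leaf (an EDB atom)


@dataclass
class Rule:
    name: str
    goals: list = field(default_factory=list)


def ground(node, assignment):
    """Copy a symbolic derivation, replacing variables by constants."""
    args = tuple(assignment.get(a, a) for a in node.args)
    rule = None
    if node.rule is not None:
        rule = Rule(node.rule.name,
                    [ground(g, assignment) for g in node.rule.goals])
    return Goal(node.pred, args, rule)


def leaves(node):
    """Return the atoms at the leaves of a derivation tree."""
    if node.rule is None:
        return [(node.pred, node.args)]
    return [leaf for g in node.rule.goals for leaf in leaves(g)]


# Symbolic derivation of link(X, Y) through r4: step(X, Y) => link(X, Y).
symbolic = Goal("link", ("X", "Y"), Rule("r4", [Goal("step", ("X", "Y"))]))
derivation = ground(symbolic, {"X": 140, "Y": 160})
print(derivation.args, leaves(derivation))
```

Each assignment of constants yields one ground derivation, so a single symbolic tree stands for the whole family of derivations obtained this way.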

In order to build a query-tree that enables us to establish some property of a set 

1 For example, one rule contains the literal $X \le Y$ and the other contains $X \ge Y$.
[Figure 3.1: three derivation trees, panels (a), (b) and (c); the drawing is not reproduced here.]

Figure 3.1: (a) is a ground derivation, (b) is a satisfiable symbolic derivation and (c) 
is an unsatisfiable symbolic derivation. 


of derivations $\mathcal{D}$, we first identify a set of symbolic derivations $\Pi$, such that encoding
the set $\Pi$ will enable us to deduce the properties we need about $\mathcal{D}$. For example, if
we are building a query-tree to encode all derivations of the query when interpreted
predicates may exist in the rules, the set $\Pi$ will be the symbolic derivations with
the property that the interpreted constraints on the variables in the derivation are
satisfiable. We denote this set by $\Pi_{sat}$. For example, in Figure 3.1, considering
the semantics of the order predicates implies that derivation (b) is satisfiable, while
derivation (c) is not satisfiable. Given such a set $\Pi$, our goal will be to build a
query-tree that encodes precisely the symbolic derivations in $\Pi$. In our discussion,
we use $\Pi$ both to denote a set of symbolic derivations and to denote the property that
distinguishes symbolic derivations in the set.
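
To make the notion of a satisfiable symbolic derivation concrete, here is a minimal Python sketch that tests a conjunction of strict order constraints for satisfiability. It is not the procedure used in this dissertation; it assumes a dense order (e.g., the rationals) and only constraints of the form a < b, in which case the conjunction is satisfiable exactly when the "less than" graph, extended with the natural order among the constants, is acyclic. The function name `satisfiable` and the two constraint sets are assumptions: they are one plausible reading of derivations (b) and (c), using the constraints on ground facts listed with Figure 3.2.

```python
from collections import defaultdict, deque


def satisfiable(less_than):
    """Check a conjunction of constraints a < b over a dense order.

    Terms are variables (str) or numeric constants. The check runs Kahn's
    algorithm on the "<" graph; a cycle means the conjunction is unsatisfiable.
    """
    edges = set(less_than)
    consts = sorted({t for pair in edges for t in pair
                     if not isinstance(t, str)})
    edges |= set(zip(consts, consts[1:]))       # e.g. 100 < 150 < 170 < 200
    nodes = {t for pair in edges for t in pair}
    succ, indeg = defaultdict(list), defaultdict(int)
    for a, b in edges:
        succ[a].append(b)
        indeg[b] += 1
    queue = deque(n for n in nodes if indeg[n] == 0)
    seen = 0
    while queue:
        n = queue.popleft()
        seen += 1
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    return seen == len(nodes)                   # all nodes sorted: no cycle


# badPoint(X): 100 < X < 200 and goodPoint(Y): 150 < Y < 170.
common = [(100, "X"), ("X", 200), (150, "Y"), ("Y", 170)]
deriv_b = common + [("X", "Y")]                 # step(X, Y): X < Y
deriv_c = common + [("X", 100), (200, "Y")]     # bigStep(X, Y): X < 100, Y > 200
print(satisfiable(deriv_b), satisfiable(deriv_c))
```

Under these assumptions the first conjunction is satisfiable (for instance X = 140, Y = 160), while the second contains the cycle 100 < X < 100 and is rejected.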
In this chapter, we consider properties $\Pi$ on symbolic derivations that can be
recognized by finite labeling. Formally, this means the following:

• There is a finite number of labels (of finite size) that can be attached to nodes
of symbolic derivation trees. The number of such labels depends only on the
size of the knowledge base (and not on the size of the symbolic derivation).

• The label of a node is computable from the labels of its children (or vice versa).

• Whether a symbolic derivation tree $d$ satisfies property $\Pi$ can be computed from
the labels of $d$. Specifically, we distinguish one label called the inconsistent label.
It should be the case that a symbolic derivation tree $d$ has the property $\Pi$ if
none of its nodes has the inconsistent label.

Essentially, the finite labeling condition means that the set of symbolic derivation
trees in $\Pi$ can be recognized by a finite tree automaton.$^{2}$ The query-tree can be
viewed as a recognizer for these symbolic derivations. The first condition assures
that the number of states in the automaton is finite and therefore that we will be
able to identify $\Pi$ using a finite structure. The second condition guarantees that we
can specify the transitions of the automaton. Specifically, this means that given an
input symbol and the current state, the next state can be determined by inspecting
the current state alone and not by inspecting the entire path that led to the current
state. Finally, the third condition guarantees that examining the labels is indeed
sufficient to recognize symbolic derivations that satisfy $\Pi$. We assume that the labels
completely specify the equality relations on the variables in the node that are entailed
by the interpreted constraints in the rules.

Returning to our example, to encode the set of symbolic derivations $\Pi_{sat}$, the
label of a node will be a constraint-label describing the constraints that the instances
of that node must satisfy. Note that because we require the constraint language to
satisfy the Finiteness Property (Section 2.4), the number of non-equivalent labels will
be finite. A symbolic derivation will be satisfiable if and only if the constraint labels
of the nodes are satisfiable.

3.1.2 Building a Query-tree

The query-tree is a symbolic AND-OR tree (a.k.a. rule-goal tree). It consists of
goal-nodes and rule-nodes (see Figure 3.2). The root of the tree is a goal-node containing
the atom of the query. Each child of a goal-node containing $g$ is a rule-node containing
a rule from the knowledge base whose consequent unifies with $g$. The rule-node has
a goal-node child for every conjunct in its antecedent, and the contents of each such

2 See [Slutzki, 1985] for an exposition of tree automata.
goal-node is the corresponding conjunct in the unification of the rule with $g$. Unlike
symbolic derivation trees, a rule-node in the query-tree will not have a child for an
antecedent of an interpreted predicate.$^{3}$


The knowledge base consists of the following rules:

$r_1$: $badPoint(X) \wedge path(X, Y) \wedge goodPoint(Y) \Rightarrow goodPath(X, Y)$.

$r_2$: $link(X, Y) \Rightarrow path(X, Y)$.

$r_3$: $link(X, Z) \wedge path(Z, Y) \Rightarrow path(X, Y)$.

$r_4$: $step(X, Y) \Rightarrow link(X, Y)$.

$r_5$: $bigStep(X, Y) \Rightarrow link(X, Y)$.

The following constraints are given on the ground facts:

$badPoint(X) \Rightarrow 100 < X < 200$.
$step(X, Y) \Rightarrow X < Y$.
$goodPoint(X) \Rightarrow 150 < X < 170$.
$bigStep(X, Y) \Rightarrow X < 100 \wedge Y > 200$.


goodPath(X, Y)  {100 < X < Y < 170, Y > 150}
  r1
    badPoint(X)  {100 < X < 170}
    path(X, Y)  {100 < X < Y < 170, Y > 150}
      r2
        link(X, Y)  {100 < X < Y < 170, Y > 150}
          r4
            step(X, Y)  {100 < X < Y < 170, Y > 150}
      r3
        link(X, Z)  {100 < X < Z < 170}
          r4
            step(X, Z)  {100 < X < Z < 170}
        path(Z, Y)  {100 < Z < Y < 170, Y > 150}
    goodPoint(Y)  {150 < Y < 170}

Figure 3.2: An example query-tree. Note that the rule $r_5$ is not expanded because
it would result in an inconsistent constraint label. The expanded equivalent of the
node $path(Z, Y)$ is $path(X, Y)$.


There are two key issues in building a query-tree. First, if a knowledge base has
recursive rules, a simple-minded top-down construction of the query-tree will not
terminate. Second, we want to guarantee that the query-tree encodes precisely the set of
symbolic derivations that satisfy the property $\Pi$. Therefore, we need some principled
method for terminating the construction of the tree by not expanding some of the
nodes. We do this by attaching labels to nodes in the tree (ultimately, these will be

3 The constraints implied by the interpreted predicates will be reflected in the labels of the nodes,
described shortly.
the same labels we use in showing that $\Pi$ can be recognized by a finite labeling). The
labels partition the possible goal-nodes that can appear in the tree into equivalence
classes. Two nodes are considered equivalent if there is an isomorphism between
them and between their labels (where the isomorphism is defined by a mapping on
the variables of the nodes). Based on the labels, we use the following termination
condition. A goal-node $g$ will not be further expanded if:

• $g$ is a node of an EDB predicate, or

• There are no rules that can be unified with $g$, or

• Expanding the node $g$ with a rule $r$ will result in a child node with the
inconsistent label, or

• There is some other goal-node $g_1$ in the tree such that $g$ and $g_1$ are equivalent
and such that $g_1$ has already been expanded.$^{4}$ We refer to $g_1$ as the expanded
equivalent of $g$, denoted by $Eq(g)$.

Intuitively, there is no need to expand both $g$ and $g_1$ because the labeling scheme
guarantees that the subtrees that would appear under $g$ are precisely the ones that
would appear under $g_1$. If a node $g$ has an inconsistent label, it means that this node
cannot appear in derivations that satisfy $\Pi$. For example, in Figure 3.2, expanding
the rule $r_5$ will result in an inconsistent (i.e., unsatisfiable) constraint label.

In order to complete the specification of an algorithm for creating a query-tree,
we need some method for assigning labels to nodes in the tree. Of course, the method
must guarantee that the resulting query-tree encodes the desired set of derivations.
The specific methods are described in the subsequent sections. Each method specifies
three components:

1. An initial label $c_0$ for the root of the query-tree.

2. A function $TDLabel(r, c, g)$ that accepts a goal-node $g$ with label $c$ and a rule
$r$ that unifies with $g$ and returns a label for the resulting rule-node child of $g$.

3. A function $TDproj(r, \theta, c, g)$ that accepts a label $c$ for a rule-node containing
a rule $r$ that was unified with its father using a unifier $\theta$, and a literal $g$ in the
antecedent of $r$, and returns a label for the goal-node child corresponding to $g$.

Given these functions, a query-tree can be built in two steps. In the first, the tree
is expanded in a top-down fashion, using the above termination condition and the
labeling procedure. In the second step, we shake the tree by removing all the nodes
that are not reachable from the base predicates and from the root.$^{5}$ The details of
the two steps are shown in Figures 3.3 and 3.4.

4 Note that $g_1$ can be any node in the tree, not necessarily an ancestor of $g$.

5 This step is needed because if a node in the query-tree is not reachable from the EDB leaves, it
procedure build-tree(P, q, c_0)
begin
    /* Creating a query-tree T for the rules P and query q. */
    /* The label of a node n in the query-tree is c_f(n). */
    The root of T is q with the label c_0.
    repeat
        Let g be a node of an IDB predicate in T with label c_f(g).
        if there is a node g_1 in T such that g = g_1 and c_f(g_1) = c_f(g) then
            Set Eq(g) = g_1.
        else
            for each rule r ∈ P do
                if rule r unifies with g then
                    θ = the most general unifier of r and g.
                    c = TDLabel(r, c_f(g), g).
                    if c is not inconsistent then
                        Create a child rule-node of g, containing the rule r, with label c.
                        for every non-interpreted literal n in the antecedent of r:
                            Create a child nθ for the rule-node with label TDproj(r, θ, c, n).
    until no changes are made to T.
    return T.
end build-tree.


Figure 3.3: Top-down creation of the query-tree
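
The top-down construction can be sketched in Python under strong simplifying assumptions, so the following is an illustration rather than the algorithm of Figure 3.3: labels are omitted entirely (equivalent to a knowledge base without interpreted predicates), two goal-nodes count as equivalent when their atoms are equal up to variable renaming, and rule heads are matched one way against the goal instead of computing a most general unifier. All function and variable names (`canon`, `rename`, `match_head`, `build_tree`, `RULES`, `EDB`) are hypothetical. Because constraints are ignored, the rule r5 is expanded here, whereas the labelled query-tree of Figure 3.2 prunes it.

```python
from collections import deque
from itertools import count


def is_var(term):
    return term[0].isupper()


def canon(atom):
    """Rename variables by first occurrence, so equivalent atoms compare equal."""
    pred, args = atom
    names = {}
    return (pred, tuple(names.setdefault(a, f"_{len(names)}") if is_var(a) else a
                        for a in args))


def rename(atom, n):
    """Rename a rule atom's variables apart by appending a suffix."""
    return (atom[0], tuple(f"{t}_{n}" if is_var(t) else t for t in atom[1]))


def match_head(head, goal):
    """One-way match of a renamed rule head against a goal (not a full MGU)."""
    if head[0] != goal[0] or len(head[1]) != len(goal[1]):
        return None
    theta = {}
    for h, g in zip(head[1], goal[1]):
        if is_var(h):
            if theta.setdefault(h, g) != g:
                return None
        elif h != g:
            return None
    return theta


def build_tree(rules, query, edb):
    fresh = count()
    root = {"atom": query, "rules": [], "eq": None}
    expanded = {}                       # canonical atom -> expanded goal-node
    queue = deque([root])
    while queue:
        g = queue.popleft()
        if g["atom"][0] in edb:
            continue                    # EDB goal-nodes are leaves
        key = canon(g["atom"])
        if key in expanded:
            g["eq"] = expanded[key]     # record the expanded equivalent Eq(g)
            continue
        expanded[key] = g
        for name, head, body in rules:
            n = next(fresh)
            theta = match_head(rename(head, n), g["atom"])
            if theta is None:
                continue
            goals = [{"atom": (p, tuple(theta.get(t, t) for t in args)),
                      "rules": [], "eq": None}
                     for p, args in (rename(b, n) for b in body)]
            g["rules"].append({"rule": name, "goals": goals})
            queue.extend(goals)
    return root


RULES = [
    ("r1", ("goodPath", ("X", "Y")),
     [("badPoint", ("X",)), ("path", ("X", "Y")), ("goodPoint", ("Y",))]),
    ("r2", ("path", ("X", "Y")), [("link", ("X", "Y"))]),
    ("r3", ("path", ("X", "Y")), [("link", ("X", "Z")), ("path", ("Z", "Y"))]),
    ("r4", ("link", ("X", "Y")), [("step", ("X", "Y"))]),
    ("r5", ("link", ("X", "Y")), [("bigStep", ("X", "Y"))]),
]
EDB = {"badPoint", "goodPoint", "step", "bigStep"}

tree = build_tree(RULES, ("goodPath", ("X", "Y")), EDB)
path = tree["rules"][0]["goals"][1]     # the goal-node path(X, Y) under r1
r3 = path["rules"][1]                   # the recursive rule r3
print(r3["goals"][1]["eq"] is path)     # recursive path node is not expanded
```

The recursion in r3 terminates because the goal-node for path(Z, Y) is recognized as equivalent to the already expanded path(X, Y) and is linked to it through `eq` instead of being expanded again.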

Encoding Symbolic Derivations in the Query-tree 

As stated, the query-tree encodes a set of symbolic derivations. Intuitively, a symbolic
derivation is encoded in the query-tree if it can be constructed by choosing one
rule-node for every goal-node. In doing so, we can expand an unexpanded goal-node with
the children of its expanded equivalent. Formally, encoding is defined as follows:

Definition 3.1: A symbolic derivation $d$ is encoded by the query-tree $T$ if there
exists a mapping $\psi$ from the nodes of $d$ that do not have interpreted predicates to
the nodes of $T$ that satisfies the following conditions:

E0. If $g_1, \ldots, g_n$ are the children goal-nodes of $r$ in $d$, then $\psi(g_1), \ldots, \psi(g_n)$ are the
children of $\psi(r)$ in $T$.

E1. For every rule-node $r \in d$, the rule in $\psi(r)$ is the same as the rule in $r$.


will not be part of a symbolic derivation, since the leaves of every symbolic derivation need to be of 
EDB predicates. 
procedure shake-tree(T)
begin
    /* Step 1: Marking reachability from the leaves */
    Mark all EDB nodes in T as accessible;
    repeat
        if all children of a rule-node r are accessible then mark r as accessible;
        if at least one child of a goal-node g is accessible then mark g as accessible;
        if Eq(g) = g_1 and g_1 is accessible then mark g as accessible;
    until no new nodes are marked;

    /* Step 2: Marking reachability from the root */
    if g is the root of T and is accessible then mark it as relevant;
    repeat
        if g is a relevant goal-node, r is a child rule-node of g, and
           all children of r are accessible
            then mark r and its children as relevant;
        if a goal-node g is relevant and either g = Eq(g_1) or g_1 = Eq(g)
            then mark g_1 as relevant;
    until no new nodes are marked;

    Remove all nodes that are not marked relevant.

    /* If there is a node g which is marked relevant, but its father rule-node is not marked
       relevant, then there must be some other node g_1 in the tree such that the father of g_1 is
       marked relevant, and either Eq(g_1) = g or Eq(g) = g_1. Let φ be the isomorphism between g
       and g_1 and let T_1 be the subtree of g. Make T_1 φ the subtree of g_1, and remove T_1 from
       the query-tree. */
end shake-tree.

Figure 3.4: Shaking the query-tree 
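
The two marking passes of Figure 3.4 can be sketched in Python as follows. This is a simplified illustration, not the procedure itself: the tree is given as two plain dictionaries with hypothetical field names (`edb`, `rules`, `eq`), goal-node and rule-node identifiers are assumed to be distinct, and the final subtree re-attachment described in the closing comment of the figure is omitted. The function returns the set of nodes marked relevant; every other node would be removed.

```python
def shake(goals, rules, root):
    """Return the ids of the nodes marked relevant.

    goals: id -> {"edb": bool, "rules": [rule ids], "eq": goal id or None}
    rules: id -> list of child goal ids
    """
    # Step 1: marking reachability from the leaves.
    accessible = {g for g, node in goals.items() if node["edb"]}
    changed = True
    while changed:
        changed = False
        for r, kids in rules.items():
            if r not in accessible and all(k in accessible for k in kids):
                accessible.add(r)
                changed = True
        for g, node in goals.items():
            if g in accessible:
                continue
            if (any(r in accessible for r in node["rules"])
                    or node["eq"] in accessible):
                accessible.add(g)
                changed = True

    # Step 2: marking reachability from the root.
    relevant = {root} if root in accessible else set()
    changed = True
    while changed:
        changed = False
        for g in list(relevant):
            if g not in goals:
                continue                      # rule-nodes have nothing to add
            new = set()
            for r in goals[g]["rules"]:
                if all(k in accessible for k in rules[r]):
                    new |= {r, *rules[r]}
            new |= {h for h, node in goals.items()
                    if node["eq"] == g or goals[g]["eq"] == h}
            if not new <= relevant:
                relevant |= new
                changed = True
    return relevant


# q has two rules: ra with the EDB child e, and rb with the IDB child s1.
# s1 is defined only through rc, whose child s2 has s1 as expanded equivalent.
goals = {
    "q":  {"edb": False, "rules": ["ra", "rb"], "eq": None},
    "e":  {"edb": True,  "rules": [], "eq": None},
    "s1": {"edb": False, "rules": ["rc"], "eq": None},
    "s2": {"edb": False, "rules": [], "eq": "s1"},
}
rules = {"ra": ["e"], "rb": ["s1"], "rc": ["s2"]}
print(sorted(shake(goals, rules, "q")))
```

In this toy tree the nodes s1 and s2 depend only on each other and never reach an EDB leaf, so rb, rc, s1 and s2 are dropped and only q, ra and e remain.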

E2. The node $\psi(root(d))$ is a root in the query-tree.

E3. If $r$ is a child of the goal-node $g$ in $d$ then:

1. If $\psi(g)$ is expanded in $T$, then $\psi(r)$ is a child of $\psi(g)$.

2. If $\psi(g)$ is not expanded in $T$, then $\psi(r)$ is a child of its expanded equivalent.

$\Box$

Note that if $d$ is encoded by the query-tree then for every goal-node $n \in d$, the
node $\psi(n)$ is a variable renaming of $n$.
Finally, given a labeling scheme, we need to show that the symbolic derivation
trees encoded by the query-tree are exactly those that satisfy $\Pi$. In doing so, we will
be aided by the correspondence between the labels of the tree and the labels given by
the finite labeling scheme. This correspondence is captured by the label-preserving
property:

Definition 3.2: Let $\Pi$ be a property of symbolic derivations that can be identified by
a finite labeling scheme that assigns a label $L(n)$ to a node $n$ in a symbolic derivation
tree. Let $T$ be a query-tree in which the label of a node $n$ is denoted by $TL(n)$.
The query-tree $T$ is label-preserving w.r.t. the labeling scheme $L$ if for any symbolic
derivation $d$ that is encoded by $T$, the equation $\phi(L(n)) = TL(\psi(n))$ holds, where $\psi$
is the node-mapping from $d$ to $T$, and $\phi$ is the variable renaming from $n$ to $\psi(n)$. $\Box$

In words, the query-tree is label-preserving if the mapping of the nodes also
preserves the labels.

The Method: Summary 

The general method for building a query-tree can be summarized by the following
steps. To establish properties of a set of derivations $\mathcal{D}$, we do the following:

• Define a property $\Pi$ of symbolic derivations, such that we can establish the
desired properties of $\mathcal{D}$ by inspecting nodes in the symbolic derivations satisfying
$\Pi$.

• Find a finite labeling scheme for $\Pi$.

• Describe a method for assigning labels to nodes in the query-tree.

• Show that the resulting query-tree encodes exactly the symbolic derivations
that satisfy $\Pi$.

In the subsequent sections we describe several instances of this general method.
Moreover, the general method provides a useful conceptual framework in which we
can devise new labeling schemes for encoding sets of derivations.

Complexity 

The time taken to build the query-tree (and therefore to decide strong irrelevance)
is dominated by the number of nodes in the tree. The other costs are those of checking
whether two nodes are equivalent and of creating labels, both of which are polynomial
in the size of a node. We observe that the number of internal nodes in the tree is
bounded by the number of possible non-isomorphic labels $l$, and therefore the size of
the tree can be at most $l \cdot b$, where $b$ is the maximum number of literals in an antecedent
of a rule. In the cases we consider, the number of labels depends only on the arity of
predicates in the KB (and may be exponential in that number). It does not depend
on the number of rules in the KB (and, of course, does not depend on the number
of ground facts!). This is an important distinction because arities of predicates tend
to be small (e.g., frame systems employ mostly binary predicates), and therefore the
algorithms will scale up to knowledge bases with many rules and ground facts.


3.2 Horn Rules With Interpreted Predicates 

In this section, we consider the problem of building a query-tree that encodes the
set of derivations of a query from a set of Horn rules $\mathcal{P}$ that may have interpreted
predicates from a constraint language $\mathcal{L}$. Building such a query-tree will provide an
algorithm for deciding strong irrelevance for the case where $\mathcal{D}_0 = \mathcal{D}_\psi$, i.e., deciding

    $SI(\phi, \psi, \Sigma_{\mathcal{P}}, DI_2, \mathcal{D}_\psi)$.

Our first step is to define the set of symbolic derivations $\Pi_{sat}$ that will be encoded
by the query-tree. As explained earlier, these are the symbolic derivations in which the
interpreted predicates on the variables are satisfiable. Formally, let $d$ be a symbolic
derivation that includes the rule-nodes $r_1, \ldots, r_m$, and let $c_i$ be the conjunction of the
literals with interpreted predicates that are children of $r_i$. Let

    $c_d = c_1 \wedge \ldots \wedge c_m$.

The derivation $d$ is a member of $\Pi_{sat}$ if the constraint $c_d$ is satisfiable.

The property in which we are interested is finding whether a ground atomic
formula or a rule can appear in a derivation of the query. Inspecting the symbolic
derivations in $\Pi_{sat}$ is enough to verify this property:

Lemma 3.3:

1. A ground formula $p(a_1, \ldots, a_n)$ can be part of a derivation of the query $\psi$ if and
only if there is some node $n = p(X_1, \ldots, X_n)$ in a symbolic derivation $d \in \Pi_{sat}$
such that $a_1, \ldots, a_n$ satisfies $c_n$, where $c_n$ is the projection of $c_d$ on the variables
$X_1, \ldots, X_n$.

2. A rule $r$ in the knowledge base can be part of a derivation of the query $\psi$ if and
only if some symbolic derivation $d \in \Pi_{sat}$ includes a rule-node containing the
rule $r$.

Proof: To prove Part 1, suppose there exists a symbolic derivation tree $d \in \Pi_{sat}$
and $a_1, \ldots, a_n$ satisfies the projection of $c_d$ on the variables $X_1, \ldots, X_n$ of the node
$n \in d$. Therefore, there is some mapping $\theta$ of the variables of $d$ to constants such
that $\theta(X_i) = a_i$ for $1 \le i \le n$ and such that applying $\theta$ to $d$ will satisfy the constraints
$c_d$. Applying $\theta$ to $d$ will yield a derivation of $\psi$ that uses $p(a_1, \ldots, a_n)$. Conversely,
suppose there is a derivation $d_0$ of $\psi$ that uses $p(a_1, \ldots, a_n)$. We can replace all the
constants in $d_0$ by variables, resulting in a symbolic derivation $d$. Clearly, the symbolic
derivation is a member of $\Pi_{sat}$ and $a_1, \ldots, a_n$ satisfies the projection of $c_d$ onto the
variables of the corresponding node in $d$.

Part 2 follows from the observation that a symbolic derivation will have exactly
the same rules as its ground derivation instances. $\Box$

To encode $\Pi_{sat}$, our labeling scheme will be the following. Given a symbolic
derivation tree $d$ and a node $n$ with variables $X_1, \ldots, X_m$, the constraint-label of $n$,
denoted by $L_{sat}(n)$, will be the constraint denoting the projection of $c_d$ onto the
variables $X_1, \ldots, X_m$. Note that since the constraint language satisfies the Closure
property, the label $L_{sat}(n)$ can be expressed as a sentence in the constraint language
$\mathcal{L}$. Intuitively, the label denotes the set of tuples that can appear in the node $n$ in
ground instances of the symbolic derivation $d$. To show that $L_{sat}$ is a finite labeling
scheme, we observe the following:

1. The Finiteness property implies that the number of possible labels (i.e., the
number of non-isomorphic constraints) is finite.

2. The label of a node can be determined by its children or by its father. The
label of a goal-node $g$ is the projection of the label of its child rule-node onto
the variables of $g$. The label of a rule-node is the conjunction of the labels of
its children.

3. A symbolic derivation $d$ is a member of $\Pi_{sat}$ if and only if all its labels are
satisfiable.

In order to build the query-tree, we need a method for assigning labels to nodes in
the tree. The difficulty in doing so is that the label of a node may depend on nodes
that appear below it in the tree. Therefore, we cannot construct the tree and assign
labels in one top-down phase, since the decision whether to expand a node depends
on knowing its exact label. This problem will arise also in the labeling scheme we
consider in Section 3.4. In what follows we describe a general method for solving this
problem.

The solution is based on the following observation about computing the labels
$L_{sat}(n)$ for nodes in a given symbolic derivation tree $d$. Given a tree $d$, we can
compute its labels by a two-phase process. In the first phase, we start with the
leaves of $d$ and compute labels based on propagating the interpreted constraints in a
bottom-up fashion. In the second phase, we compute the labels by propagating the
interpreted constraints in a top-down fashion.$^{6}$ The procedure is summarized below.
The labels computed in the bottom-up phase are denoted by $c_b$ and the final labels
are denoted by $c_f$.

for every goal-node g ∈ d, c_0(g) = True.

for every rule-node r, c_0(r) = the conjunction of the interpreted children of r.

Traverse the rule-nodes of d in a bottom-up fashion.

for each rule-node r do:

    /* g is the father of r and g_1, ..., g_m are its children. */

    c_b(r) = c_0(r) ∧ c_b(g_1) ∧ ... ∧ c_b(g_m).

    c_b(g) = Projection of c_b(r) on the variables of g.

    /* Note that c_b(g) is the formula denoting the relation which is the projection
       of R_{c_b(r)} on the variables of g. */

c_f(root(d)) = c_b(root(d)).

Traverse the rule-nodes of d in a top-down fashion.

for each rule-node r do:

    c_f(r) = c_f(g) ∧ c_b(r).

    For every n ∈ {g_1, ..., g_m}, c_f(n) = Projection of c_f(r) on the variables of n.

The following theorem shows that this procedure correctly computes the constraint
labels of a symbolic derivation tree. The proof is given in Appendix A.

Theorem 3.4: Let d be a symbolic derivation tree. For every node n ∈ d,
c_f(n) ≡ L_sat(n).
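
The two-phase procedure can be sketched in Python as follows. The sketch is only an illustration and rests on an assumption that is not in the text: labels are represented extensionally, as explicit sets of assignments over a small finite domain (DOMAIN), so that conjunction becomes a natural join and projection is literal. The helper names (sat, conj, proj, goal, rule, label) are hypothetical.

```python
from itertools import product

DOMAIN = range(4)  # assumption: a small finite stand-in for the constraint domain

def sat(variables, pred=lambda a: True):
    """A label as an explicit relation: all assignments over `variables` satisfying `pred`."""
    vs = sorted(variables)
    return {frozenset(zip(vs, t))
            for t in product(DOMAIN, repeat=len(vs))
            if pred(dict(zip(vs, t)))}

def conj(r1, r2):
    """Conjunction of two labels (natural join of the two relations)."""
    out = set()
    for a in r1:
        da = dict(a)
        for b in r2:
            if all(da.get(k, v) == v for k, v in b):
                out.add(a | b)
    return out

def proj(r, variables):
    """Projection of a label on a set of variables."""
    return {frozenset((k, v) for k, v in a if k in variables) for a in r}

def goal(variables, rule=None):
    return {"vars": set(variables), "rule": rule}

def rule(c0, *goals):
    return {"c0": c0, "goals": list(goals)}

def bottom_up(g):
    r = g["rule"]
    if r is None:
        g["cb"] = sat(g["vars"])            # leaf: c_b(g) = c_0(g) = True
        return
    cb = conj(r["c0"], sat(g["vars"]))      # c_0(r), padded with the variables of g
    for child in r["goals"]:
        bottom_up(child)
        cb = conj(cb, child["cb"])          # c_b(r) = c_0(r) and c_b(g_1) and ...
    r["cb"] = cb
    g["cb"] = proj(cb, g["vars"])           # c_b(g) = projection of c_b(r)

def top_down(g):
    r = g["rule"]
    if r is None:
        return
    r["cf"] = conj(g["cf"], r["cb"])        # c_f(r) = c_f(g) and c_b(r)
    for child in r["goals"]:
        child["cf"] = proj(r["cf"], child["vars"])
        top_down(child)

def label(root):
    bottom_up(root)
    root["cf"] = root["cb"]                 # c_f(root(d)) = c_b(root(d))
    top_down(root)

# A three-level tree: q(X,Y) <- [X >= 1], p(X,Y); p(X,Y) <- [X < Z], s(Z,Y); s(Z,Y) <- [Z < Y]
s = goal("ZY", rule(sat("ZY", lambda a: a["Z"] < a["Y"])))
p = goal("XY", rule(sat("XZ", lambda a: a["X"] < a["Z"]), s))
q = goal("XY", rule(sat("X", lambda a: a["X"] >= 1), p))
label(q)
```

In this toy tree the bottom-up label of s only records Z < Y, while its final label also reflects the constraint X >= 1 imposed higher in the tree; this is why a single top-down pass cannot assign exact labels.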

The importance of this theorem is that we can create the query-tree in a way that
mimics the computation of the labels in the two phase process. Specifically, we show
below that whenever the labels can be computed in a two phase process, it is enough
to precede the top-down creation of the query-tree (by procedure build-tree) by a
bottom-up computation phase. Informally, in the bottom-up phase we compute all
the possible bottom-up labels for predicates in the KB, i.e., all the labels c_b(n) that
can appear in symbolic derivation trees. Based on these labels, we create a new set
of refined predicates. For every label c that we compute for a predicate p, we create a
new predicate p^c. We then create a set of refined rules 𝒫_1 for the refined predicates
by trying all the possible substitutions of the refined predicates in the rules of 𝒫. The
query-tree is then created in a top-down fashion using procedure build-tree and the
rules 𝒫_1. A rule r' ∈ 𝒫_1 is a refinement of a rule r ∈ 𝒫 if the predicate names in
r' are refinements of the corresponding predicates in r. Note that r' and r have the
same variable patterns.

6 A bottom-up ordering of the rule-nodes is any ordering for which a node n is traversed after all
its descendants. A top-down traversal is the reverse order of a bottom-up traversal.

As an example of a bottom-up computation, consider the knowledge-base in
Figure 3.2. The initial labels of the EDB predicates are the constraints that are
given for them, i.e., {badPoint(X), 100 < X < 200}, {goodPoint(X), 150 < X <
170}, {step(X, Y), X < Y} and {bigStep(X, Y), X < 100, Y > 200}. With the
rules r_4 and r_5 we create the following labels for link: {link^1(X, Y), X < Y} and
{link^2(X, Y), X < 100, Y > 200}. With rules r_2 and r_3, we create
{path^1(X, Y), X < Y} and {path^2(X, Y), X < 100, Y > 200}. Finally, with path^1 we
create the label {goodPath^1(X, Y), 100 < X < Y < 170, Y > 150}. Note that
substituting path^2 in r_1 will yield the inconsistent label for goodPath, and
therefore we do not perform that substitution. The refined rules that are created are:

r_1^1: badPoint(X) ∧ path^1(X, Y) ∧ goodPoint(Y) ⇒ goodPath^1(X, Y).
r_2^1: link^1(X, Y) ⇒ path^1(X, Y).
r_2^2: link^2(X, Y) ⇒ path^2(X, Y).
r_3^1: link^1(X, Z) ∧ path^1(Z, Y) ⇒ path^1(X, Y).
r_3^2: link^2(X, Z) ∧ path^1(Z, Y) ⇒ path^2(X, Y).
r_3^3: link^1(X, Z) ∧ path^2(Z, Y) ⇒ path^2(X, Y).
r_4^1: step(X, Y) ⇒ link^1(X, Y).
r_5^1: bigStep(X, Y) ⇒ link^2(X, Y).

Formally, in a two phase computation process we assume the existence of the following:

1. Initial labels for goal-nodes in a symbolic derivation tree, c_0(n). We assume the
initial label of a goal-node depends only on the predicate of the node.

2. A function BULabel(r, (g_1, ..., g_m), (c_b(g_1), ..., c_b(g_m))) that accepts a
rule-node, its subgoals and their respective bottom-up labels and computes the
bottom-up label c_b(r) for the rule-node.

3. A function BUProj(r, θ, c_b(r), g) that accepts a rule-node that contains the rule
r, the unifier θ with which r was unified with its father goal-node g, and the
bottom-up label of the rule-node, and returns a bottom-up label for its father
goal-node g.

4. Functions TDLabel and TDProj as used in procedure build-tree, for computing
the top-down labels.

We define the result of the two phase computation as follows (which is a
generalization of the computation of L_sat):

• If g is a leaf in the tree, c_b(g) = c_0(g).

• If r is a rule-node with children g_1, ..., g_m and father g, then

  c_b(r) = BULabel(r, (g_1, ..., g_m), (c_b(g_1), ..., c_b(g_m))), and

  c_b(g) = BUProj(r, θ, c_b(r), g).

• c_f(root(d)) = c_b(root(d)).

• c_f(r) = TDLabel(r, c_f(g), θ).

• c_f(g_i) = TDProj(r, θ, c_f(r), g_i).

Definition 3.5: The labeling scheme L is said to be 2-phase computable if c_f(n) ≡
L(n) for every node n in every symbolic derivation tree. ∎

The bottom-up phase of building the query-tree is shown in Figure 3.5. All the
labels created for a predicate p in this phase use the variables X_1, ..., X_n, where n is
the arity of p. Therefore, we omit the second argument from BUProj, assuming it uses
these standard variable names. The complete query-tree construction is described in
Figure 3.6. After creating the query-tree we ignore the refinements of the predicates.
That means that if n is a node in the tree of a predicate p^c and a label c_f(n), we treat
it as if it is a node of the predicate p with the same label. Note that the query-tree
may actually be a forest of trees if the bottom-up phase computes more than one label
for the query predicate.


procedure create-refined-rules(𝒫)
begin

    /* Constructing bottom-up labels P for every predicate p. */
    for every EDB predicate p ∈ 𝒫, P = {c_0(p)}.
    for every IDB predicate p ∈ 𝒫, P = {}.

    𝒫_1 = {}.  /* 𝒫_1 will be the set of refined rules */

    repeat

        Let r be the rule q_1 ∧ ... ∧ q_m ⇒ h.

        Let c_i ∈ Q_i, for 1 ≤ i ≤ m.

        c = BULabel(r, (q_1, ..., q_m), (c_1, ..., c_m)).

        if c is consistent then
            c_h = BUProj(r, c, h).

            Add c_h to H.

            Add the rule q_1^{c_1} ∧ ... ∧ q_m^{c_m} ⇒ h^{c_h} to 𝒫_1.
    until no new labels or rules are created.

end create-refined-rules


Figure 3.5: Creating the refined rules.
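
The fixpoint structure of this procedure can be sketched in Python under a strong simplification that is our own assumption, not part of the text: every predicate is unary over one shared variable, and a label is an open interval (lo, hi), so that BULabel reduces to interval intersection and BUProj is the identity. The names meet, consistent and create_refined_rules, as well as the toy rules, are hypothetical.

```python
from itertools import product

def meet(a, b):
    """Conjunction of two interval labels (assumed stand-in for BULabel)."""
    return (max(a[0], b[0]), min(a[1], b[1]))

def consistent(c):
    """An open interval over a dense domain is satisfiable iff lo < hi."""
    return c[0] < c[1]

def create_refined_rules(rules, edb_labels):
    """rules: list of (name, body predicates, head predicate)."""
    labels = {p: {c} for p, c in edb_labels.items()}   # EDB: P = {c_0(p)}
    refined = set()
    changed = True
    while changed:                                     # repeat ... until no change
        changed = False
        for name, body, head in rules:
            choices = [sorted(labels.get(q, ())) for q in body]
            for combo in product(*choices):            # one label c_i per subgoal
                c = (float("-inf"), float("inf"))
                for ci in combo:
                    c = meet(c, ci)
                if not consistent(c):
                    continue                           # skip inconsistent substitutions
                new_rule = (name, tuple(zip(body, combo)), (head, c))
                if new_rule not in refined:
                    refined.add(new_rule)
                    labels.setdefault(head, set()).add(c)
                    changed = True
    return labels, refined

rules = [("r1", ("a",), "s"), ("r2", ("c",), "s"), ("r3", ("s", "b"), "q")]
edb = {"a": (0, 10), "b": (5, 20), "c": (30, 40)}
labels, refined = create_refined_rules(rules, edb)
```

As with path^2 in the example above, the substitution of the second label of s into r3 is inconsistent with the label of b, so no refined rule is created for it.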

The following theorem states that whenever there is a 2-phase computable labeling
scheme for the set of symbolic derivations Π, then the procedure build-query-tree
will build a query-tree that encodes precisely the set of derivations Π.

procedure build-query-tree(𝒫, q)

    /* 𝒫 is the set of rules, q is the query predicate. */
    𝒫_1 = create-refined-rules(𝒫);
    for every label c of q do

        T_c = build-tree(𝒫_1, q^c);

    Query-tree = ∪_{c ∈ Q} shake-tree(T_c);
end build-query-tree.


Figure 3.6: Building a query-tree.


Theorem 3.6: Let L be a 2-phase computable finite labeling scheme for the set of
symbolic derivations Π. Let T be the query-tree generated by procedure
build-query-tree:

1. If d is a symbolic derivation tree in Π, then d is encoded in T, and the encoding
is label-preserving.

2. Let d_1 be a partial symbolic derivation tree encoded by the query-tree (i.e., some
of the leaves have IDB predicates), then there is a symbolic derivation tree d ∈ Π
such that d_1 is a prefix of d and the encoding (limited to the nodes mapped to
d_1) is label preserving.

3. A node n appears in a symbolic derivation tree in Π with label L(n) if and only if
there is some node in the query-tree with label c_f(n) such that L(n) is equivalent
to c_f(n).

Proof: In the proof we assume that the query is of the form q(X) where X is a set
of distinct variables. Therefore, we refer to the query simply as the predicate q. Note
that we can always transform the query into such a form.

We first prove Part 1. Let d be a symbolic derivation. A simple bottom-up
induction on the nodes of d shows that the bottom-up labels of d were computed by
create-refined-rules, specifically:

• For every goal-node g ∈ d of the predicate p and bottom-up label c = c_b(g), p^c
is a predicate in 𝒫_1.

• Suppose r is a rule-node in d containing the rule q_1 ∧ ... ∧ q_m ⇒ p. Suppose r's
father is g and children are g_1, ..., g_m; then the following rule is in 𝒫_1:

    q_1^{c_b(g_1)} ∧ ... ∧ q_m^{c_b(g_m)} ⇒ p^{c_b(g)}.

Next, we show that d is encoded by the query-tree by mimicking the execution of
procedure build-tree. We construct the encoding mapping ψ of the nodes as we go
along.

We begin with the root of d and its child rule-node r. Let c denote c_b(root(d)). By
the bottom-up construction we know that q^c is one of the refined predicates, and that
there is a rule r_1 in 𝒫_1 that is a refinement of the rule in r, and its consequent is q^c.
Therefore, procedure build-tree will be called with q^c. The procedure build-tree
will begin with a node n with predicate q^c and the label c, and it will expand q^c with
the rule r_1. Therefore, ψ will map root(d) to n and will map r to r_1. The mapping ψ
will map children of r to the respective children of ψ(r). The bottom-up and top-down
labels of root(d) are the same, and therefore ψ is label preserving for root(d). Since
the labels specify completely the equality relations between the variables, ψ(root(d))
is a variable renaming of root(d). Consequently, since r_1 is a refinement of the rule
in r (and therefore they have the same variable patterns), the top-down labels of
ψ(r) and its children are determined by ψ(root(d)) (using the functions TDLabel
and TDProj) in the same way that the top-down labels of r and its children are
determined from root(d). Therefore, the mapping ψ is also label preserving for r and
its children, and specifically, ψ(n) is a variable renaming of n for n being r or one of
its children.

Let r_1, ..., r_n be a top-down ordering of the rule-nodes of d. We prove the claim
by induction on the i-th rule-node. We assume by induction that we have built an
encoding mapping ψ that satisfies the conditions of Definition 3.1 for all the
rule-nodes r_1, ..., r_{i-1} and their children goal-nodes. Note that for i = 1 this is
exactly the base case discussed above. We prove that Part 1 holds for r_i and for its
children. Furthermore, we assume by induction that if g is a goal-node in d, then ψ(g)
is actually a goal-node of p^{c_b(g)}, when the refinements of the goal-nodes in the
query-tree are considered. 7 Note that this assumption holds for the root of d.

Let g be the father of r_i in d, and assume that g is a node of the predicate p. By
the inductive assumption, ψ(g) is a node of the predicate p^{c_b(g)}. Assume ψ(g) was
expanded in the query-tree. It would have been expanded with the rule

    r': q_1^{c_1} ∧ ... ∧ q_m^{c_m} ⇒ p^{c_b(g)}

where c_1, ..., c_m are the bottom-up labels of the children of r_i, and r' is a refinement
of the rule in r_i. Denote the resulting rule-node in the query-tree by r. The mapping
ψ will map r_i to the node r and the subgoals of r_i to the subgoals of r (therefore
satisfying condition E0 of Definition 3.1). Note that since r' is a refinement of the rule
in r_i and we ignore the predicate refinements in the resulting query-tree, the rule in r_i
and in ψ(r_i) are the same (as required by condition E1). Furthermore, condition E3
is also satisfied by ψ. As in the base case, since r_i and ψ(r_i) contain the same rule,
and ψ(g) is a variable renaming of g, the top-down labels of ψ(r_i) and its children
are determined by ψ(g) (using the functions TDLabel and TDProj) in the same way
that the top-down labels of r_i and its children are determined from g. Therefore, the
mapping ψ is also label preserving for r_i and its children, and specifically, ψ(n) is a
variable renaming of n, when n is either r_i or one of its children.

7 I.e., we consider the refined predicates in the rules 𝒫_1.

If ψ(g) was not expanded in the query-tree, it would be because Eq(ψ(g)) is
expanded. In that case, r would be a child of Eq(ψ(g)). In this case, E3 is still
satisfied by the second clause in its definition. E0 and E1 hold as before. Since the
label of Eq(ψ(g)) is isomorphic to the label of ψ(g) (and in particular, Eq(ψ(g)) is a
variable renaming of ψ(g)), the mapping ψ will be label preserving also for r_i and its
children.

To complete the proof of Part 1 we must show that none of the nodes ψ(n) for
n ∈ d were deleted from the query-tree in the shaking phase. However, a simple
bottom-up induction on the nodes of d will show that all nodes were marked
accessible, and a top-down induction will show that they were all marked relevant
and therefore not deleted.

We prove Part 2 in two parts. First, we show that every partial derivation
encoded by the query-tree is a prefix of a complete symbolic derivation encoded by the
tree. Next we show that every symbolic derivation encoded by the tree is a symbolic
derivation in Π.

To prove the first part, we note that every goal-node in the query-tree is the root
of some symbolic derivation (and the label of that node is the label of the root of
the tree). In proof, if g is a node in the query-tree, then there is a sequence of nodes
n_1, ..., n_m such that n_m = g and n_i was marked accessible because of some n_j for
j < i. The node g is a head of a symbolic derivation consisting of n_1, ..., n_m. A
simple induction on the reverse order of these nodes shows that they were all marked
relevant and are therefore all in the query-tree.

Consequently, given a partial derivation tree d_1 encoded in the query-tree, we
can complete every IDB leaf of d_1 with a symbolic derivation, thereby constructing a
complete derivation.

To complete the proof of Part 2, let d' be a symbolic derivation encoded by the
tree. Simply consider the symbolic derivation tree d with exactly the same structure
(i.e., the same rules). Because the labeling is 2-phase computable, the labels of d' will
be identical to the labels of d. Since the query-tree does not contain nodes with the
inconsistent label, d' will be a member of Π.

Part 3 follows from the first two parts and the observation that every node in the
query-tree appears in some partial derivation tree encoded by the query-tree. ∎

We complete this section with the following corollary that shows that the
query-tree provides a sound and complete inference procedure for strong irrelevance for
Horn-rule KBs with interpreted predicates.

Corollary 3.7: Let 𝒫 be a set of rules with interpreted predicates that satisfy the
Closure, Equivalence, Satisfiability and Finiteness properties. Let T be the query-tree
created for the rules 𝒫 and the query q.

1. A formula p(a_1, ..., a_n) is strongly irrelevant to q w.r.t. Σ_𝒫 (i.e., the
irrelevance claim SI(p(a_1, ..., a_n), q, Σ_𝒫, D_𝒫, XX) holds), if and only if there is no
node n of p in T such that a_1, ..., a_n satisfies the constraint label of n, c_f(n).

2. A rule r is strongly irrelevant to q if and only if it does not appear in T.

Returning to the example in Figure 3.2, the rule r_5 is strongly irrelevant to
goodPath because it does not appear in the query-tree. The atomic formulas of
step(X, Y) for which X < 100 or Y > 170 are also strongly irrelevant to the query.
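
Part 1 of the corollary amounts to a simple membership test, sketched below in Python. The label used for the step nodes, 100 < X < Y < 170, is our assumption (it is consistent with the label of goodPath^1 computed earlier, but the query-tree itself is not reproduced here), and the function name strongly_irrelevant is hypothetical.

```python
def strongly_irrelevant(args, node_labels):
    """True if the ground tuple satisfies none of the labels of the predicate's nodes."""
    return not any(label(*args) for label in node_labels)

# Assumed constraint labels of the step nodes in the query-tree for goodPath.
step_labels = [lambda x, y: 100 < x < y < 170]

print(strongly_irrelevant((50, 120), step_labels))
print(strongly_irrelevant((120, 150), step_labels))
print(strongly_irrelevant((150, 180), step_labels))
```

Under the assumed label, step(50, 120) and step(150, 180) are reported as strongly irrelevant, while step(120, 150) is not.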

3.2.1 Conjunctive Dense-order Constraints 

One of the constraint languages which was covered by the discussion in the previous
section is ℒ_{∧,∨}, i.e., dense-order constraints with conjunction and disjunction. An
important restricted language is that of dense-order constraints in which only
conjunctions are allowed, which we denote by ℒ_∧. The atomic formulas of this language
are of the form (X θ Y) or (X θ a), where X and Y are variables, a is a constant, and
θ ∈ {<, ≤, >, ≥, =, ≠}. Formulas in the language are either atomic or conjunctions of
atomic formulas. In [Ullman, 1989], a complete polynomial-time decision procedure
for this language is presented. Unfortunately, this language does not satisfy the
Closure property we require. Specifically, given a sentence c in ℒ_∧, there is not always
a sentence in ℒ_∧ that expresses the projection of c on a subset of its variables. The
following is an example of such a case.

Example 3.8: Consider the conjunction:

    X_1 ≤ Z, Z ≤ X_2, Z ≠ X_3

This conjunction implies only one conjunctive constraint, X_1 ≤ X_2, among the
variables X_1, X_2, X_3. However, that does not fully describe all the constraints among
X_1, X_2, and X_3. The constraint X_3 ≠ X_1 ∨ X_3 ≠ X_2 is also implied by this
conjunction, but since our language does not allow disjunctions, we cannot express this
when trying to project the above conjunction onto X_1, X_2, and X_3. ∎

In creating a query-tree for constraints expressed in ℒ_∧, we modify the projection
functions (BUProj and TDProj). Since we cannot always express the exact projection
of a constraint in ℒ_∧, these functions return a weaker constraint. Specifically, given a
constraint c on variables X and a subset Y ⊆ X, the functions return the conjunction
of all the atomic constraints on variables in Y that are implied by c. In our example,
the projection would be X_1 ≤ X_2.
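
The weaker projection can be checked by brute force on Example 3.8. The Python sketch below is only an illustration under stated assumptions: it handles constraints among variables only (no constants), and it decides implication by enumerating every weak ordering of the variables as a rank assignment, which is exponential and not the polynomial procedure cited above. The names OPS, models and weak_projection are hypothetical.

```python
from itertools import product

OPS = {
    "<": lambda a, b: a < b,
    "<=": lambda a, b: a <= b,
    "=": lambda a, b: a == b,
    ">=": lambda a, b: a >= b,
    ">": lambda a, b: a > b,
    "!=": lambda a, b: a != b,
}

def models(atoms, variables):
    """All weak orderings of the variables (as rank assignments) satisfying every atom."""
    vs = sorted(variables)
    for ranks in product(range(len(vs)), repeat=len(vs)):
        a = dict(zip(vs, ranks))
        if all(OPS[op](a[x], a[y]) for x, op, y in atoms):
            yield a

def weak_projection(atoms, variables, keep):
    """Conjunction of all atomic constraints on `keep` that are implied by `atoms`."""
    ms = list(models(atoms, variables))
    ks = sorted(keep)
    return [(x, op, y)
            for i, x in enumerate(ks) for y in ks[i + 1:]
            for op in OPS
            if all(OPS[op](m[x], m[y]) for m in ms)]

c = [("X1", "<=", "Z"), ("Z", "<=", "X2"), ("Z", "!=", "X3")]
variables = {"X1", "X2", "X3", "Z"}
projection = weak_projection(c, variables, {"X1", "X2", "X3"})

# X1 = X2 = X3 satisfies the weak projection but cannot be extended to a model of c.
collapsed = any(m["X1"] == m["X2"] == m["X3"] for m in models(c, variables))
```

Here projection contains the single atom X1 <= X2, and collapsed is False, which is exactly the information lost by the weaker projection.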

Consequently, the labels computed for nodes in the query-tree, which we denote
by c_f^∧(n), are weaker than the labels given by the labeling scheme L_sat. Therefore they
do not describe the tightest constraint on every node in the query-tree. Fortunately,
we can show that these labels are closely related to the labels given by L_sat, and that
we can use them to deduce strong irrelevance. Informally, the difference between the
resulting labels c_f^∧(n) and L_sat(n) is that c_f^∧(n) may be missing some disequalities
(≠) between the variables. Formally, we show the relation between them through the
least equality extension of a conjunctive label, defined as follows:

Definition 3.9: Let c be a constraint on the variables X_1, ..., X_n and constants
a_1, ..., a_m. The least equality extension of c, denoted by Max_≠(c), is

    c ∧ {X_i ≠ X_j | c ⊭ X_i = X_j ∧ 1 ≤ i < j ≤ n} ∧ {X_i ≠ a_j | c ⊭ X_i = a_j ∧ 1 ≤ j ≤ m}

∎

The least equality extension simply adds disequalities between every pair of
variables (or variable and constant) that are not required to be equal by c. Note that
the least equality extension is unique and therefore well defined since it can be built
incrementally by examining each pair of variables (or variable and constant), and
the order of the construction does not matter. In our example, the least equality
extension of X_1 ≤ X_2 is {X_1 < X_2 ∧ X_1 ≠ X_3 ∧ X_2 ≠ X_3}.
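
The incremental construction can be sketched in Python as follows. This is a simplified illustration that assumes a satisfiable conjunction over variables only (constants are omitted), in which case an equality is entailed exactly when each variable is derivably less than or equal to the other. The names forced_equal and least_equality_extension are hypothetical.

```python
from itertools import combinations

def forced_equal(atoms, x, y):
    """For a satisfiable conjunction over a dense order: is x = y entailed?"""
    succ = {}                                   # edge u -> v means "u <= v is implied"
    for lhs, op, rhs in atoms:
        if op in ("<", "<=", "="):
            succ.setdefault(lhs, set()).add(rhs)
        if op in (">", ">=", "="):
            succ.setdefault(rhs, set()).add(lhs)

    def reaches(src, dst):
        seen, stack = {src}, [src]
        while stack:
            u = stack.pop()
            if u == dst:
                return True
            for v in succ.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return False

    return reaches(x, y) and reaches(y, x)

def least_equality_extension(atoms, variables):
    """Add X != Y for every pair of variables that the conjunction does not force equal."""
    extra = [(x, "!=", y)
             for x, y in combinations(sorted(variables), 2)
             if not forced_equal(atoms, x, y)]
    return list(atoms) + extra

ext = least_equality_extension([("X1", "<=", "X2")], {"X1", "X2", "X3"})
```

For the running example, ext is X1 <= X2 together with X1 != X2, X1 != X3 and X2 != X3, which is equivalent to the extension stated above.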

The following theorem relates c_f^∧(n) to c_f(n) (which was shown to be equivalent
to L_sat(n)). It shows that the label c_f^∧(n) is never stronger than c_f(n) and that c_f(n)
is never stronger than Max_≠(c_f^∧(n)). Recall that c_d denotes the conjunction of all the
interpreted constraints on variables in a symbolic derivation d.

Theorem 3.10: Let d be a symbolic derivation tree for which c_d is not necessarily
satisfiable. For every node n ∈ d,

1. c_f(n) ⊨ c_f^∧(n).

2. If c_f(n) is satisfiable, then Max_≠(c_f^∧(n)) ⊨ c_f(n).

3. If for all n, c_f^∧(n) is satisfiable, then for all nodes n, c_f(n) is satisfiable.

The first and second parts guarantee the relationship between the labels in the
query-tree constructed with ℒ_∧ and the labels of L_sat. The third part guarantees
that when we build a query-tree with labels c_f^∧, the tree will not have any nodes that
shouldn't be included, i.e., all nodes are satisfiable (since they are weaker than the
satisfiable constraint Max_≠(c_f^∧(n))). The proof of the theorem is given in Appendix A.

As stated, the labels in the resulting query-tree will not be the tightest ones
possible. That means that if c is a label of a goal-node n in the query-tree, then
the actual tuples that can appear in valid derivations of the query may be a strict
subset of those satisfying c. Formally, we can use the resulting query-tree to deduce
irrelevance claims as follows:

Corollary 3.11: Let p(a_1, ..., a_n) be a ground atomic formula.

1. If a_1, ..., a_n does not satisfy any of the constraint labels of nodes of p in the
query-tree, then it is strongly irrelevant to the query.

2. Let g be a node of the predicate p in the query-tree. If a_1, ..., a_n satisfies
Max_≠(c_f^∧(g)), then p(a_1, ..., a_n) is not strongly irrelevant to the query.

3. A rule r is in the query-tree if and only if r is not strongly irrelevant to the
query.

Proof: The query-tree encodes the set of symbolic derivations d in which for all n ∈ d
the labels c_f^∧(n) are satisfiable. Part 1 follows from Part 1 of Theorem 3.10 and Part 2
follows from Part 2 of that theorem. For Part 3, consider a symbolic derivation tree
d encoded by the query-tree that uses a rule r. All of its labels are satisfiable, and
therefore, by Part 3 of Theorem 3.10, c_d is also satisfiable. Consequently, there is a
symbolic derivation of the query that is a member of Π_sat and includes r. Since the
query-tree encodes a superset of the derivations in Π_sat, then clearly if r does not
appear in the query-tree, then it is strongly irrelevant to the query. ∎

Note that even though we do not get the tightest labels on the nodes when we use
the language ℒ_∧, there may be advantages to using it over using ℒ_{∧,∨}. Specifically,
when we allow disjunctions, the constraints may become long (many disjuncts), and
furthermore, checking equivalence of two constraints involving disjunctions is a more
expensive operation. Therefore, the time to build the query-tree can be significantly
affected.

The Number of Labels

As stated earlier, the time complexity of building the query-tree is dominated by the
number of possible labels we can attach to nodes in the tree.

Consider the case of dense-order constraints expressed in ℒ_{∧,∨}. In that case, every
constraint label describes a set of possible orderings on the variables and constants in
the rules. Given n variables and m constants, the number of possible total orderings
on them is exponential in n + m. Therefore, the number of constraint labels is doubly
exponential in n + m. However, we note that it is sufficient to consider only total
orderings on the variables and constants, and therefore the query-tree can be built
in time that is singly exponential in n + m. However, in practice, the number of
constraints that will be computed will be much smaller than the number of total
orders and therefore, it is better not to limit ourselves to total orders. The number
of labels expressible in ℒ_∧ is exponential in n + m, because each label is a subset
(conjunction) of the atomic formulas, of which there is a polynomial number.
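
To give a feeling for these growth rates, the following Python sketch counts two quantities. Both counts are our own illustrative assumptions rather than figures from the text: weak_orderings(k) counts the orderings of k unordered items in which ties (equalities) are allowed, ignoring the fact that constants already have a fixed order, and atomic_formulas(n, m) counts one atom per operator for each variable pair and each variable-constant pair.

```python
from math import comb

def weak_orderings(k):
    """Number of orderings of k items with ties allowed (ordered Bell numbers)."""
    a = [1]
    for n in range(1, k + 1):
        # Choose the j items that are tied in first place, then order the rest.
        a.append(sum(comb(n, j) * a[n - j] for j in range(1, n + 1)))
    return a[k]

def atomic_formulas(n, m, ops=6):
    """Atoms X θ Y and X θ a over n variables, m constants and `ops` operators."""
    return ops * (n * (n - 1) // 2 + n * m)

for k in range(1, 7):
    print(k, weak_orderings(k))
print(atomic_formulas(3, 2))
```

A conjunctive label is a subset of the atoms counted by atomic_formulas, so the number of conjunctive labels is at most 2 raised to that count, whereas a disjunctive label may select any set of orderings.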

3.2.2 Rules With Function Symbols 

When the set of rules contains function symbols, the Finiteness property may not
hold. The source of the problem is that when a goal-node is unified with the head
of a rule, new terms may be created, and therefore the number of labels that can be
created may be infinite. Consider the following example.

Example 3.12: The following rules define the set of integers:

s_1: (X = 0) ⇒ integer(X)

s_2: integer(X) ⇒ integer(X + 1)

As shown in Figure 3.7(a), a top-down expansion of the tree for these rules will result
in an infinite number of labels {Z = X - i} for every integer i. Therefore, the
construction of the query-tree will not terminate. ∎


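[Figure 3.7, panel (a): the top-down expansion of integer(X) with rules s_1 and s_2,
whose successive goal-nodes integer(Y), integer(Z), ... carry the labels {Y = X - 1},
{Z = X - 2}, ... Panel (b): the query-tree in which integer(X) and its subgoal
integer(Y) both carry the empty label {}.]

Figure 3.7: Query-tree with function symbols.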

To build a query-tree in this case we can assign the nodes in the query-tree one of
a finite set of labels C. When we project a constraint on a subset of its variables, we
proceed by the following strategy. Given a constraint c and a subset of its variables X,
if there is no label in C which describes the exact projection c|_X, we assign a member
c_1 of C such that c|_X ⊨ c_1 and such that there is no other constraint c_2 ∈ C such that
c_2 ⊨ c_1 and c|_X ⊨ c_2. The constraint c_1 can be viewed as the best approximation
to c|_X out of the finite number of labels C. Consequently, the resulting labels in
the query-tree are weaker than the tightest ones possible, and therefore, the
query-tree provides only a sufficient condition for strong irrelevance. That means that a
ground atomic formula which does not match any of the nodes in the tree is strongly
irrelevant, but not vice versa.

One way to assign such a finite set of labels is to not allow new terms to be
created in the labels (or to allow a maximum of k new terms, where k is fixed). For
instance, in our example, if we do not allow new terms, we get the query-tree shown
in Figure 3.7(b).
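
The effect of bounding the number of new terms can be simulated with a short Python sketch for the rules of Example 3.12. The encoding is our own assumption: a label {Z = X - i} is represented by the offset i, the empty label {} by None, and the parameter max_steps exists only to cut off the otherwise non-terminating expansion. The function name expand_labels is hypothetical.

```python
def expand_labels(max_new_terms, max_steps=50):
    """Labels met while repeatedly expanding integer(X) with rule s_2.

    A label is the offset i in {Z = X - i}; None stands for the empty label {}.
    """
    seen = []
    label = 0
    for _ in range(max_steps):
        if label in seen:              # this label was already expanded: stop here
            break
        seen.append(label)
        if label is None or label + 1 > max_new_terms:
            label = None               # no further new term allowed: weaken to {}
        else:
            label = label + 1          # unifying with s_2 creates the term X - (i + 1)
    return seen

print(expand_labels(0))
print(expand_labels(2))
print(len(expand_labels(10**9)))
```

With a bound the expansion stops as soon as the weakened label repeats; with an effectively unlimited bound it only stops because of the max_steps cutoff.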

Finally, it should be noted that the problem with function symbols arises only
when the rules are recursive. If they are not, then the number of labels will
necessarily be finite (because the number of unifications is finite, and each unification may
introduce only a finite number of new terms). Consequently, in such cases, the
query-tree still provides a complete inference procedure for strong irrelevance (assuming the
constraint language satisfies the properties described in the previous section).


3.3 Encoding Minimal Derivations 

In this section, we consider another instance of the query-tree algorithm in which the
query-tree is built to encode only the minimal derivations of the query. The definition
of minimality that we consider (M1 from Chapter 2) states that a derivation is minimal
if there are no two identical nodes, n_1 and n_2, such that n_1 is an ancestor of n_2.
In Chapter 2 we showed that strong irrelevance w.r.t. this definition is equivalent to
strong irrelevance w.r.t. the stronger definition M2, and provides a sufficient condition
for strong irrelevance under the condition M3.

Example 3.13: Consider the following knowledge base, where e is the EDB predicate
and p and p_1 are IDB predicates.

r_1: p(Y, X) ⇒ p(X, Y).
r_2: e(X, Y) ⇒ p(X, Y).
r_3: p(X, X) ⇒ p_1(X).

The rule r_1 can only appear in non-minimal derivations of p_1. Note however, that r_1
will appear in minimal derivations of p. ∎

As before, our first step is to find a set of symbolic derivations that the
query-tree will encode. We build a query-tree that encodes the following set of symbolic
derivations Π_min. A derivation d is a member of Π_min if:

1. d ∈ Π_sat, and

2. There is no pair of nodes p(X_1, ..., X_n) and p(Y_1, ..., Y_n) for any predicate p,
such that c_d ⊨ (X_1 = Y_1) ∧ ... ∧ (X_n = Y_n), and such that p(X_1, ..., X_n) is
an ancestor of p(Y_1, ..., Y_n). (Note that in this case, the two nodes in the tree
would be identical.) If two such nodes exist we would say that the tree contains
a loop.
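
On ground derivations, the loop condition is a plain ancestor check, sketched below in Python for the knowledge base of Example 3.13. The tree encoding (a pair of an atom string and a list of child trees) and the name has_loop are our own hypothetical choices; the symbolic version would compare nodes modulo the equalities entailed by c_d rather than by string identity.

```python
def has_loop(node, ancestors=()):
    """True if some node in the derivation tree is identical to one of its ancestors."""
    atom, children = node
    if atom in ancestors:
        return True
    return any(has_loop(child, ancestors + (atom,)) for child in children)

# p1(a) via r_3 from p(a,a), which is derived via r_1 from p(a,a) itself: a loop.
non_minimal = ("p1(a)", [("p(a,a)", [("p(a,a)", [("e(a,a)", [])])])])

# p(a,b) via r_1 from p(b,a), which is derived via r_2 from e(b,a): no loop.
minimal = ("p(a,b)", [("p(b,a)", [("e(b,a)", [])])])

print(has_loop(non_minimal), has_loop(minimal))
```

This mirrors the observation in the example: r_1 contributes only a redundant step to derivations of p_1, but it can be used in a minimal derivation of p.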

In showing the properties of Π_min, we make the following additional assumptions
about the constraint language used in the rules:

Density: Suppose that R_f(X_1, ..., X_n) is the relation consisting of the tuples
satisfying the formula f, and for all 1 ≤ i < j ≤ n, formula f does not imply X_i = X_j or
X_i = a, where a is a constant. Let X_1, ..., X_k be k columns in the relation R_f, and
let R' be the set of all tuples in R_f in which column i, 1 ≤ i ≤ k, has the constant a_i,
where a_1, ..., a_k are arbitrary constants. The Density Property requires that R' be
an infinite set or an empty set.

Intuitively, the property guarantees that if we are given a partial assignment to
variables that satisfy a certain constraint, then we can complete the assignment in
arbitrarily many ways. The reason the assumption is needed is that we want to
guarantee that if a symbolic derivation tree d is in Π_min then we can always find a
corresponding ground derivation in which every variable in d is mapped to a distinct
constant, and will therefore be a minimal derivation.

Equality connectivity: Suppose the variable X appears in the goal-nodes g_1 and
g_2 in a symbolic derivation tree d. Let g be the least common ancestor goal-node of
g_1 and g_2 in d. Then X appears in every goal-node on the path from g to g_1 and on
the path from g to g_2.

Note that both of these assumptions hold for constraint languages using order
predicates, as long as the domain of the variables is assumed to be dense, or for the
constraint language containing only equality. When the rules do not contain
interpreted predicates, they can be viewed as using the constraint language of equalities, 8
and therefore the Density property is satisfied. Furthermore, it should be noted that
if these properties are not satisfied, then the query-tree still provides a sound
inference procedure for strong irrelevance. This means that a node may appear in the
query-tree and still be strongly irrelevant to the query. It should be noted that even
when these assumptions do not hold, finding examples in which the query-tree is not
complete requires careful crafting of the rules.

8 Because equalities can be represented implicitly by multiple occurrences of the same variable.

Under these assumptions, we can deduce properties of minimal derivations by
inspecting symbolic derivations in Π_min as follows:

Lemma 3.14:

1. If p(a_1, ..., a_n) does not satisfy any of the labels of nodes of p in any tree
d ∈ Π_min, then p(a_1, ..., a_n) does not appear in any minimal derivation of the query.

2. If p(a_1, ..., a_n) satisfies the constraint-label of some node n (of the predicate p)
in a symbolic derivation tree d ∈ Π_min, and the equality relations between the
constants a_1, ..., a_n are only those that are entailed by the constraint label, then
there is some minimal derivation of the query that uses p(a_1, ..., a_n).

3. A rule r is used in a minimal derivation of the query if and only if r appears in
some symbolic derivation d ∈ Π_min.

Proof: To prove Part 1, if p(a_1, ..., a_n) does not match any node in symbolic
derivations in Π_min, there are two possibilities. The first is that p(a_1, ..., a_n) does not
match any node in symbolic derivations in Π_sat. If this is the case, clearly
p(a_1, ..., a_n) does not appear in any derivation of the query (by Lemma 3.3). The other
possibility is that it only appears in derivations of Π_sat whose corresponding symbolic
derivations contain a loop. However, every instance of a symbolic derivation that
contains a loop will contain a loop, and will therefore not be minimal.

Part 3 is proved as follows. If d ∈ Π_min uses the rule r, then d ∈ Π_sat. Therefore c_d
is satisfiable. Consider an assignment v of the variables in d that satisfies c_d and such
that v assigns two variables X_i and X_j the same value only if c_d ⊨ X_i = X_j. Since
d contains no loops, the ground derivation obtained by applying v to d will be a
minimal derivation. If it were not, that would imply that there are two nodes g_1(X)
and g_1(Y) such that c_d ⊨ X = Y.

Conversely, if there is a minimal derivation that uses r, consider its corresponding
symbolic derivation d. Clearly, d ∈ Π_min.

The proof of Part 2 requires the Density property. Let X_1, ..., X_m be the
variables in d and assume X_1, ..., X_n are the variables that appear in the
node whose label is satisfied by p(a_1, ..., a_n). We show that there is an
assignment v to the variables of d that satisfies c_d and such that:

1. for 1 ≤ i ≤ n, v(X_i) = a_i, and

2. for 1 ≤ i, j ≤ m, v(X_i) = v(X_j) (or v(X_i) = a) only if it is implied by
   c_d.

Applying v to d will yield a minimal derivation of the query that uses
p(a_1, ..., a_n).

We start by assigning v(X_i) = a_i for 1 ≤ i ≤ n. Note that this mapping
satisfies the second condition because a_1, ..., a_n only satisfies the
equalities required by the label it matches. We proceed by induction on i.
Given assignments to X_1, ..., X_i, we need to assign a value to X_{i+1}. If
c_d ⊨ X_{i+1} = X_j for some j ≤ i, then we assign X_{i+1} the value assigned to
X_j (similarly, if c_d ⊨ X_{i+1} = a, we assign a).

Otherwise, we show that there are an infinite number of assignments a_{i+1} to
X_{i+1} such that v(X_1), ..., v(X_i), a_{i+1} will satisfy c_d, and therefore
we can choose a value a_{i+1} that is not assigned to any of X_1, ..., X_i.

Let b_{i+2}, ..., b_m be an assignment to the variables X_{i+2}, ..., X_m such
that the tuple consisting of v(X_1), ..., v(X_i), b_{i+2}, ..., b_m satisfies
c_d. Note that b_{i+2}, ..., b_m must exist because v(X_1), ..., v(X_i)
satisfies c_d. Now consider the selection on c_d in which X_j = v(X_j) for
1 ≤ j ≤ i and X_j = b_j for i+1 < j ≤ m. This is the subset of R_d that is equal
on the columns X_1, ..., X_i, X_{i+2}, ..., X_m, and therefore, by the Density
property, must be infinite. This means that there are an infinite number of
values that X_{i+1} can take that will be consistent with v(X_1), ..., v(X_i)
and with c_d. ∎
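The construction in the proof of Part 2 can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: the constraint c_d is taken to consist of equalities only, so that "a fresh value" plays the role that the Density property plays for order constraints. The function name build_assignment and the generated constants k0, k1, ... are hypothetical and do not appear in the text.

```python
from itertools import count


def build_assignment(variables, forced_equal, fixed):
    """Return an assignment v that extends `fixed` and maps two variables to
    the same value only if their equality is forced.

    variables    -- ordered list of variable names X_1, ..., X_m
    forced_equal -- pairs (x, y) whose equality is entailed by the constraint
    fixed        -- dict giving v(X_i) = a_i for the first n variables
    Assumption: the constraint contains equalities only (no order atoms).
    """
    parent = {v: v for v in variables}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for x, y in forced_equal:
        parent[find(x)] = find(y)

    value_of_class = {}
    for var, const in fixed.items():
        root = find(var)
        if value_of_class.setdefault(root, const) != const:
            raise ValueError("fixed constants contradict a forced equality")

    used = set(value_of_class.values())
    fresh = (f"k{i}" for i in count())
    assignment = {}
    for var in variables:
        root = find(var)
        if root not in value_of_class:
            # Inductive step: choose a value not assigned to any earlier class.
            candidate = next(fresh)
            while candidate in used:
                candidate = next(fresh)
            value_of_class[root] = candidate
            used.add(candidate)
        assignment[var] = value_of_class[root]
    return assignment


v = build_assignment(["X1", "X2", "X3", "X4"], [("X3", "X1")],
                     {"X1": "a", "X2": "b"})
```

Here X3 receives the value of X1 because their equality is forced, while X4 receives a constant that no other variable uses, mirroring the two cases of the induction.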

It is important to note that Lemma 3.14 implies that if we can build a
query-tree to encode Π_min, then strong irrelevance under minimal derivations is
decidable. The only subtle point that needs to be considered is when we have
atoms of the form p(a_1, ..., a_n) that satisfy some label in some derivation
tree in Π_min, but where a_1, ..., a_n satisfy additional equalities not implied
by the label. In such a case we can create a specialized predicate p' that
enforces these equalities. For example, if we had an atom p(X, X), we would
create a predicate p'(X) defined as p(X, Y) ∧ (X = Y). We then consider every
rule in the KB in turn. Whenever a rule can use the predicate p, we make another
version of it that uses p'.9 We then build a query-tree for the KB that includes
the rules with p' and check if p'(a'_1, ..., a'_m) matches a node in that
query-tree (where a'_1, ..., a'_m is the result of removing duplicate constants
from a_1, ..., a_n).

Our next step is to devise a labeling scheme for Π_min. A label of a node n,
denoted by L_min(n), will be a pair (c, t) where c = L_sat(n) and t will be the
tag of n, defined as follows. We denote by V(g) the variables that appear in the
node g.

Definition 3.15: Let g be a goal-node in a symbolic derivation tree. Let S be
the set of its ancestor goal-nodes that have only variables from V(g) or
constants. If S contains a node that is identical to g, the tag of g is
inconsistent. Otherwise, the tag of g, denoted by T(g), is S ∪ {g}. The tag of a
rule-node r ∈ d is the tag of its father goal-node. ∎
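Definition 3.15 can be read operationally. The following Python sketch is an illustration only, not the dissertation's implementation: atoms are encoded as (predicate, arguments) pairs, terms starting with an upper-case letter are treated as variables, and "identical" is taken to mean syntactic equality. All of these encoding choices, and the names tag, variables_of and INCONSISTENT, are assumptions introduced here.

```python
INCONSISTENT = "inconsistent"


def is_variable(term):
    # Assumed convention: variables start with an upper-case letter.
    return term[:1].isupper()


def variables_of(atom):
    _, args = atom
    return {t for t in args if is_variable(t)}


def tag(goal, ancestors):
    """Tag of a goal-node, given the list of its ancestor goal-nodes."""
    allowed = variables_of(goal)
    # S: ancestors whose variables all come from V(goal); constants are free.
    s = {a for a in ancestors if variables_of(a) <= allowed}
    if goal in s:
        return INCONSISTENT
    return frozenset(s | {goal})


# Goal-nodes named after those discussed in Example 3.18.
p_xy = ("p", ("X", "Y"))
q_xy = ("q", ("X", "Y"))
q_xz = ("q", ("X", "Z"))
p_xz = ("p", ("X", "Z"))

loop_tag = tag(p_xy, [p_xy, q_xy])        # p(X, Y) below the root p(X, Y)
fresh_tag = tag(q_xz, [p_xy, q_xy])       # q(X, Z) shares only X with ancestors
deeper_tag = tag(p_xz, [p_xy, q_xy, q_xz])
```

The first call reproduces the loop described in the example (a subgoal identical to the root), while the other two show that a node with a new variable starts from an empty set S and can therefore be expanded.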

Two labels (c_1, t_1) and (c_2, t_2) are the same if there is an isomorphism
between c_1 and c_2 which is also an isomorphism between t_1 and t_2.

Lemma 3.16: The labeling L_min is a finite labeling scheme for Π_min. There
exist functions BU_min and TD_min such that L_min is 2 phase computable.


9 If p appears in two or more subgoals of a rule r we make a version for every subgoal.
Proof: To show that there is a finite number of labels, it suffices to show that
there is only a finite number of possible tags. Consider the atoms of a
predicate p in a tag. The number of different atoms of p is the number of
possible variable patterns of the arguments of p, which is the number of ways to
partition the arguments of p into equivalence classes. This number is
exponential in the arity of p (cf. [Graham et al., 1989], pg. 2-M). Therefore,
the number of atoms that may appear in a tag is exponential in the maximum arity
of predicates in P. Consequently, since a tag is a set of atoms, the number of
possible tags is doubly exponential in the arity.
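The counting step above can be checked on small arities. The number of ways to partition n argument positions into equivalence classes is the Bell number B_n; the sketch below computes it with the Bell triangle and, as a cross-check, enumerates the patterns explicitly. This is an illustrative aid only (constants are ignored), and the function names bell and patterns are introduced here.

```python
def bell(n):
    """Bell number B_n: the number of partitions of an n-element set."""
    row = [1]
    for _ in range(n):
        # Each new row of the Bell triangle starts with the last entry of the
        # previous row; every further entry adds the entry above-left.
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
    return row[0]


def patterns(n):
    """Enumerate variable patterns of n arguments as restricted growth strings.

    For example, (0, 1, 0) is the pattern of p(X, Y, X).
    """
    results = []

    def extend(prefix, next_label):
        if len(prefix) == n:
            results.append(tuple(prefix))
            return
        for label in range(next_label + 1):
            extend(prefix + [label], max(next_label, label + 1))

    extend([], 0)
    return results


counts = [bell(k) for k in range(6)]
```

For a ternary predicate there are five patterns, corresponding to p(X, X, X), p(X, X, Y), p(X, Y, X), p(X, Y, Y) and p(X, Y, Z).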

Next, we observe that the tag of a goal-node can be determined by the tag of
its father. Let g be a goal-node whose grandfather goal-node is g_1, and suppose
g' ∈ T(g), i.e., g' is an ancestor of g that has only variables from V(g) or
constants. By the Equality Connectivity assumption, V(g') ⊆ V(g_1) (because g_1
is on the path from g' to g). Therefore g' ∈ T(g_1), and g' will be in T(r),
where r is the father rule-node of g. Consequently, T(g) can be determined by
inspecting only the atoms in T(r).

Since T(g) can be determined by the tag of its father, it also follows that
L_min is 2 phase computable. In the first phase, the computation is identical to
that of L_sat, and so we define BU_min to be identical to BULabel. In the second
phase, the function TD_min(r, θ, (c, T(r)), g) will compute a label (c', T(g))
as follows. The first component is simply that computed by TDLabel, i.e.,
c' = TDLabel(r, θ, c, g). To compute the tag of g, simply inspect the atoms in
the tag of its father r. Aside from g itself, any atom that includes only
variables from g and constants will be in T(g). If g ∈ T(r) then T(g) will be
inconsistent.

Finally, we need to show that Π_min can be verified by inspecting the labels
L_min of the nodes in the tree d. This follows from the following observation:
d ∈ Π_min if and only if for all n ∈ d, L_min(n) is not inconsistent, i.e., if
L_min(n) = (c, t), then c is satisfiable and t is not inconsistent.

In proof, suppose there are two identical nodes n_1 and n_2 in d such that n_1
is an ancestor of n_2, and let r be the father of n_2. The tag T(r) would
contain n_1 (because of the Equality Connectivity assumption) and therefore,
when computing T(n_2) we would get an inconsistent label. Conversely, if for
some node n, T(n) is inconsistent, it must be the case that n ∈ T(r), where r is
the father of n. This means that one of the ancestors of n is identical to n. ∎

The corollary below follows from Lemma 3.16 and Theorem 3.0.

Corollary 3.17: The procedure build-query-tree with functions BU_min and TD_min
will compute a query-tree that encodes precisely the symbolic derivations in
Π_min.

The following example illustrates the use of tags and shows that they are indeed
necessary in order to derive strong irrelevance under minimal derivations.
Example 3.18: Consider the following knowledge base. The e_i's are the EDB
predicates.

r_1 : q(X, Z) ∧ e_1(Z, Y) ⇒ q(X, Y)
r_2 : e_1(X, Y) ⇒ q(X, Y)
r_3 : p(X, Y) ⇒ q(X, Y)
r_4 : e_2(X, Y) ⇒ p(X, Y)
r_5 : q(X, Y) ⇒ p(X, Y)


Since there are no interpreted constraints in the rules and no equalities
between variables, the constraint labels of all the nodes in the query-tree (see
Figure 3.8(a)) will have the True constraint. Note that we do not expand the
node q(X, Y) with rule r_3 because it will result in a subgoal p(X, Y) which is
identical to the root of the tree, and will therefore produce an inconsistent
tag. However, we can expand the node q(X, Z) with r_3 without creating a loop.
Therefore it is important to distinguish between q(X, Y) and q(X, Z) even though
they have the same constraint label L_sat. Since L_min(q(X, Z)) is the same as
L_min(q(X, T)), we do not expand the latter node.

Figure 3.8(b) shows the query-tree that would be obtained using the labeling
L_sat. In this case, the node q(X, Y) would have been expanded with r_3 and
therefore, the query-tree would encode also non-minimal derivations. ∎

3.4 Rules with Negations in the Antecedents 

Recall from Chapter 2 that if we have a set of rules with stratified negation,
then strong irrelevance is undecidable (Lemma 2.18). In this section we discuss
a restricted case of stratified rules in which only literals of EDB predicates
may appear negated in the rules. In this case, a derivation can be viewed as a
tree as before, except that some of the leaves of the tree may be negated
literals. A negated literal ¬e(ā) is considered to be satisfied if the ground
atom e(ā) is not in the knowledge base.

As before, the query-tree will encode a set of symbolic derivations, in this
case denoted by Π_neg. A symbolic derivation d will be a member of Π_neg if

1. d ∈ Π_sat and

2. There is no pair of atoms e(X̄) and ¬e(Ȳ) such that c_d ⊨ (X̄ = Ȳ),
   i.e., there is no pair of contradictory leaf atoms.

The second condition guarantees that the symbolic derivation is satisfiable,
i.e., it does not require that both an atom and its negation be in the knowledge
base.

The following lemma shows that encoding Π_neg will enable us to decide strong
irrelevance:
Figure 3.8: (a) A query-tree with node-tags (shown only for IDB goal-nodes). (b)
The query-tree that would have been produced without considering tags.

Lemma 3.19:

1. If p(a_1, ..., a_n) does not satisfy any of the labels of nodes of p in any
   tree d ∈ Π_neg, then p(a_1, ..., a_n) does not appear in any valid derivation
   of the query.

2. If p(a_1, ..., a_n) satisfies the constraint-label of some node n (of the
   predicate p) in a symbolic derivation tree d ∈ Π_neg, and a_1, ..., a_n
   satisfies only the equalities required by the constraint label, then there is
   some database in which p(a_1, ..., a_n) is used in a derivation of the query.

3. A rule appears in a symbolic derivation in Π_neg if and only if it is not
   strongly irrelevant to the query.

Proof: The proof is very similar to that of Lemma 3.14. For Part 1, suppose
p(a_1, ..., a_n) is part of a valid derivation d' of the query, and let d be a
symbolic derivation corresponding to d'. Clearly, d ∈ Π_neg.
The proof of Part 2 follows from the claim shown in the proof of Lemma 3.14.
There we proved the following claim. Let X_1, ..., X_m be the variables in d and
assume X_1, ..., X_n are the variables that appear in the node whose label is
satisfied by p(a_1, ..., a_n). Then there is an assignment v to the variables of
d that satisfies c_d and such that:

1. for 1 ≤ i ≤ n, v(X_i) = a_i, and

2. for 1 ≤ i, j ≤ m, v(X_i) = v(X_j) (or v(X_i) = a) only if the equality is
   implied by c_d.

Suppose we apply v to d. Since d ∈ Π_neg, d does not have two contradicting
literals. Therefore, dv will have two contradicting literals only if two
distinct variables X_i and X_j, such that c_d ⊭ X_i = X_j, were mapped by v to
the same constant. However, that contradicts the assumption on v. Therefore, dv
is a valid derivation of the query that uses p(a_1, ..., a_n).

Part 3 is proved exactly as in the proof of Lemma 3.14. ∎

As in the case of minimal derivations, it should be noted that if we can build
a query-tree to encode precisely Π_neg, then strong irrelevance is decidable for
such rules. The next step is to devise a labeling scheme for Π_neg. For clarity,
we begin with the case in which there are no interpreted predicates in the
rules. Furthermore, we assume that no positive subgoal or head of a rule in the
KB has the same variable in two or more columns. Rules that do not satisfy this
assumption can be converted into such a form using equality constraints and will
therefore be covered later.

Note that the Density property holds trivially in this case. Furthermore, all
unifications of rules with subgoals are trivial. Our labeling scheme in this
case will be L_neg, which is defined as follows. A label of a node n is a pair
(c, e), where c = L_sat(n) and e is the EDB-label of n, defined as follows:

Definition 3.20: EDB-label: Let r be a rule-node in a symbolic derivation tree d
and let g be its father goal-node. Let S be the set of all EDB literals that
appear in the subtree rooted in r. We say that the set S is consistent if it
does not contain an atom A and its negation ¬A. If S is consistent, then the
EDB-label of rule-node r, e(r), is the set of literals of S that contain only
variables from g or constants. If S is not consistent, then the EDB-label of r
is the inconsistent label. The EDB-label of an IDB goal-node g is the same as
the EDB-label of its child rule-node. The EDB-label of an EDB goal-node is the
set containing itself. ∎

As before, two sets of literals e_1 and e_2 are considered to be identical
EDB-labels if there is a 1-1 mapping ψ of the variables of e_1 to the variables
of e_2 such that ψ(e_1) = e_2.

Example 3.21: Consider the following knowledge base:

r_1 : e(X, Y) ⇒ p(X, Y)
r_2 : e(X, Z) ∧ ¬g(X) ∧ ¬g(Z) ∧ p(Z, Y) ⇒ p(X, Y)
r_3 : p(X, Y) ∧ g(Y) ∧ ¬e(X, Y) ⇒ c(X, Y)

Rules r_1 and r_2 define a p (path) relation in terms of EDBs e (edge) and g
(good nodes). Rule r_3 defines a c (connectivity) relation. Figure 3.9 shows a
symbolic derivation tree created from this knowledge base with its EDB-labels. ∎


c(X, Y)  {¬g(X), g(Y), ¬e(X, Y)}
  r_3
    p(X, Y)  {¬g(X)}
      r_2
        e(X, Z)
        ¬g(X)
        ¬g(Z)
        p(Z, Y)  {e(Z, Y)}
          r_1
            e(Z, Y)
    g(Y)
    ¬e(X, Y)

Figure 3.9: Symbolic derivation tree with EDB labels

The following proposition shows that an EDB-label can be computed from the 
EDB-labels of its subgoals. 

Proposition 3.22: The EDB-label of a rule-node r can be computed from the
EDB-labels e_1, ..., e_m of its subgoals as follows. Let S be e_1 ∪ ... ∪ e_m.
If S is consistent, then the EDB-label of r is the set of literals in S that
contain only variables appearing in r's father, g. Otherwise, the EDB-label of r
is inconsistent.

Proof: The proof follows from the Equality Connectivity assumption.
Specifically, if a variable X appears in two nodes n_1 and n_2 in a symbolic
derivation tree such that n_1 is an ancestor of n_2, then it appears in every
goal-node on the path from n_1 to n_2. Therefore, if an EDB literal g_e contains
only variables that appear in one of its ancestors g, then either g_e is a
subgoal of g or there is a subgoal g_1 of g which is an ancestor of g_e. In the
latter case, g_e will be in the EDB-label of g_1 because all the variables of
g_e must appear in g_1. In both cases, g_e will be in the set S defined above. ∎

Lemma 3.23:

1. The number of EDB-labels is finite.

2. A symbolic derivation tree d is a member of Π_neg if and only if none of the
   rule-nodes of d has the inconsistent label.

3. There exist functions BU_neg and TD_neg such that L_neg is 2 phase
   computable.

Proof: To prove Part 1, we observe, as in the case of node tags, that the number
of atoms that can appear in an EDB-label is exponential in the maximum arity of
predicates in P (though in this case, the number is double since they can either
appear positively or negatively). Since an EDB-label is a set of atoms, the
number of EDB-labels is doubly exponential in the arity of the predicates.

Part 2: Suppose d has a node r with an inconsistent label (note that the
inconsistency can only come from the EDB-label, since equality constraints are
always satisfiable). That means that r has an atom and its negation in its
subtree, and therefore, d ∉ Π_neg. Conversely, suppose d contains an atom A and
its negation ¬A. Let r be the least common ancestor goal-node of A and ¬A. The
EDB-label of r will be inconsistent.

To prove Part 3, we define two functions BU_neg and TD_neg. Note that under the
assumptions we have made, the constraint part of L_neg is always the True
constraint. The EDB-label part is computed by BU_neg as defined in Proposition
3.22. The function TD_neg is simply the identity function, since the EDB-label
does not change in the top-down phase. The proof follows from Proposition 3.22. ∎

Corollary 3.24: The procedure build-query-tree with the functions BU_neg and
TD_neg will compute a query-tree that encodes precisely the set of derivations
Π_neg.
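The bottom-up step of Proposition 3.22 is short enough to sketch in Python. The encoding is an assumption made for illustration: a literal is a triple (positive, predicate, arguments), variables are terms that start with an upper-case letter, and the function name edb_label is hypothetical. Propagating an inconsistent subgoal label upward is also an assumption of this sketch; the proposition itself only states the two cases for the union S.

```python
INCONSISTENT = "inconsistent"


def is_variable(term):
    # Assumed convention: variables start with an upper-case letter.
    return term[:1].isupper()


def is_consistent(literals):
    # S is consistent if it contains no atom together with its negation.
    return not any((not pos, pred, args) in literals
                   for (pos, pred, args) in literals)


def edb_label(head_vars, subgoal_labels):
    """EDB-label of a rule-node from the EDB-labels of its subgoals."""
    if any(label == INCONSISTENT for label in subgoal_labels):
        return INCONSISTENT  # assumption: inconsistency propagates upward
    s = set().union(*subgoal_labels)
    if not is_consistent(s):
        return INCONSISTENT
    # Keep only literals whose variables all appear in the father goal-node.
    return frozenset(
        lit for lit in s
        if all(not is_variable(t) or t in head_vars for t in lit[2])
    )


# Rule r_2 of Example 3.21 applied below p(X, Y); the subgoal p(Z, Y) carries
# the label {e(Z, Y)} and each EDB subgoal is labeled by itself.
r2_label = edb_label({"X", "Y"}, [
    {(True, "e", ("X", "Z"))},
    {(False, "g", ("X",))},
    {(False, "g", ("Z",))},
    {(True, "e", ("Z", "Y"))},
])

# Rule r_3 with the label {e(X, Y)} for p clashes with the subgoal ¬e(X, Y).
r3_label = edb_label({"X", "Y"}, [
    {(True, "e", ("X", "Y"))},
    {(True, "g", ("Y",))},
    {(False, "e", ("X", "Y"))},
])
```

Literals mentioning Z are dropped from the label of r_2 because Z does not occur in the head, which is why only ¬g(X) survives.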

Returning to Example 3.21, the first step of the query-tree algorithm will
produce the following EDB-labels. Note that to avoid confusion, we use variables
in the EDB-labels that are disjoint from those that appear in the tree. Rule r_1
derives the EDB-label {e(X_1, X_2)} for p. Using the EDB-label {e(X_1, X_2)} for
p, rule r_2 derives the EDB-label {¬g(X_1)} for p. Using the EDB-label
{¬g(X_1)} for p in rule r_2 generates the isomorphic label {¬g(X_1)} for p.
Thus, no more EDB-labels for p can be derived. Using the EDB-label {¬g(X_1)} for
p, rule r_3 derives the EDB-label {¬g(X_1), g(X_2), ¬e(X_1, X_2)} for c. Using
the EDB-label {e(X_1, X_2)} for p in rule r_3 generates an inconsistency, and no
new EDB-label is derived. Consequently the set of refined rules is the
following:

r_1'  : e(X, Y) ⇒ p^{e(X_1,X_2)}(X, Y)
r_2'a : e(X, Z) ∧ ¬g(X) ∧ ¬g(Z) ∧ p^{e(X_1,X_2)}(Z, Y) ⇒ p^{¬g(X_1)}(X, Y)
r_2'b : e(X, Z) ∧ ¬g(X) ∧ ¬g(Z) ∧ p^{¬g(X_1)}(Z, Y) ⇒ p^{¬g(X_1)}(X, Y)
r_3'  : p^{¬g(X_1)}(X, Y) ∧ g(Y) ∧ ¬e(X, Y) ⇒ c^{¬g(X_1), g(X_2), ¬e(X_1,X_2)}(X, Y)


The query-tree created for this example is shown in Figure 3.10. Note that we do
not expand the rightmost node p^{¬g(X_1)}(Z, Y), since its EDB-label is the same
as the EDB-label of node p^{¬g(X_1)}(X, Y).

c^{¬g(X_1), g(X_2), ¬e(X_1,X_2)}(X, Y)
  r_3'
    p^{¬g(X_1)}(X, Y)
      r_2'a
        e(X, Z)
        ¬g(X)
        ¬g(Z)
        p^{e(X_1,X_2)}(Z, Y)
          r_1'
            e(Z, Y)
      r_2'b
        e(X, Z)
        ¬g(X)
        ¬g(Z)
        p^{¬g(X_1)}(Z, Y)   (not expanded)
    g(Y)
    ¬e(X, Y)

Figure 3.10: The query-tree built for the program P_1.


Adding Interpreted Predicates

In the previous section we showed that we can compute L_neg in a 2 phase
procedure. However, that result depended on the observation that we knew all the
equality relations between variables in the tree during the bottom-up phase.
Specifically, if a variable in the body of a rule must be equal to a variable in
the head, then we would know that in the bottom-up computation of BU_neg. The
assumptions we made in the previous section guaranteed that property because
rules could not imply any equality relations between variables. However, when we
allow the rules to have interpreted literals, this assumption may not hold, and
therefore, the EDB-label computed may not be correct. The following example
illustrates the problem.

Example 3.25: Consider the following rules:

r_1 : e_1(X, Y) ∧ e_2(Z) ∧ X ≤ Z ≤ Y ⇒ p(X, Y)
r_2 : e_3(X, Y, T) ∧ ¬e_2(T) ∧ X ≥ T ≥ Y ⇒ q(X, Y, T)
r_3 : p(X, Y) ∧ q(X, Y, T) ⇒ s(X, Y)

The EDB-labels would be computed as follows. Using r_1, we first create a label
{e_1(X_1, X_2)} for p. Note that e_2(Z) is not included in the EDB-label because
Z does not appear in the head of r_1 nor is it known to be equal to one of the
variables in it.
Next, with r_2 we will create a label {e_3(X_1, X_2, X_3), ¬e_2(X_3)} for q.
Finally, with r_3 we will create a label {e_1(X_1, X_2)} for s. However,
considering the constraint label of s(X, Y) implies that X = Y = T, and
therefore X = Y = Z. Consequently, the symbolic derivation of s is inconsistent
because it contains both e_2(Z) and ¬e_2(T). However, the EDB-labels computed
were all consistent and so we were not able to detect the contradiction. ∎

Fortunately, there is an easy fix for this problem. Recall that after computing
the constraint labels L_sat of nodes in a symbolic derivation tree, the labels
are as restrictive as possible, and therefore describe all the equality
constraints between variables. Thus, we can create a new set of adorned
predicates and rules from the query-tree that have the constraints completely
propagated. Specifically, if g is a goal-node of the predicate p in the
query-tree (built with BULabel and TDLabel), and L_sat(g) = c, then we create an
adorned predicate p^c. If r is a rule-node in the query-tree, and we created an
adorned predicate p^c from its father and predicates q_1^{c_1}, ..., q_m^{c_m}
from its children, then we create the adorned rule

q_1^{c_1} ∧ ... ∧ q_m^{c_m} ∧ L_sat(r) ⇒ p^c

in which none of the positive literals in the antecedent have the same variable
in different columns. We denote the new set of rules by P_1. We note that the
rules P_1 are equivalent to P w.r.t. the query q. This means that
(P ∪ D) ⊢ q(ā) if and only if there is some c such that (P_1 ∪ D) ⊢ q^c(ā).
Moreover, when we build a query-tree for P_1 then the constraint labels are
completely known in the bottom-up phase, and therefore, we can compute the
EDB-labels in parallel with the constraint labels. Returning to our example, we
would create the following rules from the query-tree (note that every predicate
has only one adornment, so we do not change the predicate names):

r_1 : e_1(X, Y) ∧ e_2(Z) ∧ X = Z = Y ⇒ p(X, Y)
r_2 : e_3(X, Y, T) ∧ ¬e_2(T) ∧ X = T = Y ⇒ q(X, Y, T)
r_3 : X = Y = T ∧ p(X, Y) ∧ q(X, Y, T) ⇒ s(X, Y)

Computing EDB-labels with these rules will result in an inconsistent label for s.
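The effect of propagating equalities before comparing literals can be illustrated with a small Python sketch. It is not the query-tree algorithm itself: it only checks the leaf literals of the symbolic derivation of s from Example 3.25, once ignoring and once using the equalities that the adorned rules make explicit. The union-find encoding and the function names are assumptions introduced for this illustration.

```python
def find(parent, x):
    """Representative of x's equivalence class (union-find, path halving)."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, x, y):
    parent[find(parent, x)] = find(parent, y)


def contradictory(literals, equalities):
    """True if some atom and its negation coincide modulo the equalities.

    literals   -- iterable of (positive, predicate, args)
    equalities -- iterable of variable pairs known to be equal
    """
    parent = {}
    for x, y in equalities:
        union(parent, x, y)
    seen = set()
    for positive, pred, args in literals:
        key = (pred, tuple(find(parent, a) for a in args))
        if (not positive, key) in seen:
            return True
        seen.add((positive, key))
    return False


# Leaf literals of the symbolic derivation of s(X, Y) in Example 3.25.
leaves = [
    (True, "e1", ("X", "Y")),
    (True, "e2", ("Z",)),
    (True, "e3", ("X", "Y", "T")),
    (False, "e2", ("T",)),
]
# Equalities made explicit by the adorned rules: X = Z = Y and X = T = Y.
eqs = [("X", "Z"), ("Z", "Y"), ("X", "T"), ("T", "Y")]

without_propagation = contradictory(leaves, [])
with_propagation = contradictory(leaves, eqs)
```

Without the equalities, e_2(Z) and ¬e_2(T) look unrelated; once Z and T fall into the same class, the clash becomes visible.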


3.5 Complexity 

As stated at the outset, the time complexity of building the query-tree depends
on the number of different labels that can be attached to the nodes in the tree.
We have seen that the number of labels we may have for L_sat with the constraint
language £ A,V is exponential in the arity of the predicates. The number of
labels we may have for L_min or L_neg may be doubly exponential in the arity.
The following theorem shows that we cannot expect to do much better than that.
Specifically, it shows that once we introduce the predicate the lower bound on
the problem of detecting strong irrelevance is exponential in the arity. The
same is true for encoding L_min even without interpreted predicates.

Theorem 3.26: Given a set of rules P, a query predicate q, and a rule r ∈ P,
deciding $ llr.q.Eij'- is hard for exponential time if the rules may contain
the predicate ^ .

Deciding $ l (r. q.E-p. D I 2 . M 1 ) is hard for exponential time even if P
does not contain any interpreted predicates.

The proof is based on reducing the acceptance problem of a linear-space
alternating Turing machine (ATM) to the problem of detecting strong irrelevance
of rules. The details of the proof are given in Appendix A.


3.6 Summary 

In this chapter we presented a general method for encoding a set of derivations,
therefore enabling us to deduce properties of that set efficiently.
Specifically, the method enables us to deduce strong irrelevance claims. The
method involves constructing a query-tree that finitely encodes all the possible
derivations in the given set. The key issue in the construction of the
query-tree is its termination condition, which is based on a labeling scheme we
have devised. The labeling scheme depends on the specific set of derivations we
wish to encode. We have shown three instances of the query-tree method: (1)
encoding the set of all derivations for Horn rule KBs with interpreted
predicates, (2) encoding the set of all minimal derivations of a query and (3)
encoding the set of valid derivations when rules may have negated EDB literals
in their antecedents. Importantly, in these instances, the number of possible
labels, and therefore the size of the query-tree, does not depend on the number
of rules in the knowledge base, only on the arity of the predicates.
Consequently, the query-tree algorithm is likely to scale up to large knowledge
bases. In addition to the three instances described, the method provides a
powerful conceptual framework in which devising new labeling schemes becomes
much easier.

3.6.1 Related Work

The intuition behind the query-tree algorithm comes from translating the problem
into a decision problem for tree-automata.10 In fact, we have argued that a
finite labeling scheme essentially guarantees that the set of derivations can be
recognized by a reachability test on a finite tree automaton. With respect to
tree automata, the contribution of our work is twofold. The first is showing
that the problem can be recognized by a tree automaton. This involves coming up
with the labeling scheme (i.e., the states of the automaton), showing that
indeed it is sufficient to encode precisely the set of symbolic derivations of
interest, and showing that examining this set of symbolic derivations is enough
to decide irrelevance. The second is developing the query-tree, which is a more
efficient and natural recognizer of the set of symbolic derivations. The
query-tree essentially combines the creation of the tree automaton and the
reachability test into one algorithm. Moreover, the query-tree will usually
produce only a subset of the states of the automaton needed to recognize the set
of derivations, and working with the query-tree is conceptually simpler than
working directly with tree automata. As we see in the next chapter, the
query-tree will also lend itself to other natural usages.

10 See [Vardi, 1989] for a discussion of the importance of tree automata in database theory.

Several other authors have considered static analysis of rules for different
purposes, such as explanation based learning [Etzioni, 1993], partial evaluation
of logic programs [Smith and Hickey, 1990; Lloyd and Shepherdson, 1991;
Bruynooghe et al., 1991], automated reasoning [Kowalski, 1975; Bruynooghe et
al., 1989] and deductive databases [Srivastava and Ramakrishnan, 1992; Ullman,
1989]. Some have also used graph-like representations of the rules, such as
problem space graphs [Etzioni, 1993], connection graphs [Kowalski, 1975],
compilation graphs [Bruynooghe et al., 1989] and rule/goal graphs [Ullman,
1989]. Others have used rule folding/unfolding in their analysis.

The key issue common to work that utilizes graph-like representations of rules
or fold/unfold transformations is when to terminate the creation of the graph
(or when to stop unfolding the rules). The query-tree is novel in that it gives
a well motivated termination criterion based on manipulation of the interpreted
constraints that appear in the rules. Consequently, with the exception of
[Srivastava and Ramakrishnan, 1992], only the query-tree can be shown to be
complete in more than straightforward cases (i.e., in the presence of recursion
and constraints). Recall that completeness guarantees that the query-tree
encodes precisely the set of desired derivations. [Srivastava and Ramakrishnan,
1992] have a similar result to ours11 but only for the case of L_sat (and not
for the case of conjunctive order constraints). Their techniques cannot be
extended to the cases covered by our general method.

Another important difference is the size of the query-tree, which depends only
on the arity of the predicates. In contrast, in previous tree-like structures
(e.g., [Etzioni, 1993]), the termination condition of the tree involves checking
whether a node is isomorphic to one of its ancestors. This leads to a tree whose
size can be exponential in the number of rules.

11 Obtained simultaneously with ours.

Connection graphs [Kowalski, 1975] were also developed for the purpose of
focusing a theorem prover by precomputing all the possible pairs of resolvable
clauses. Clearly, if a certain clause appears in a component of the graph that
is not connected to the component of the negation of the query, it can be
removed from the KB (i.e., it is strongly irrelevant). However, connection
graphs only capture a subset of the possible dependencies between clauses.
Specifically, they only show that two clauses connected to a link are unifiable,
but say nothing about the relationship between clauses connected via longer
paths in the graph. Other work [Sickel, 1976; Chang, 1979] has considered
following only certain walks on the graph; however, these walks are not
guaranteed to encode valid derivations, as are the paths encoded in the
query-tree.


Chapter 4 


Uses of The Query-Tree


The query-tree, as described in the previous chapter, is a powerful tool for
relevance reasoning and speeding up inference. In this chapter we describe uses
of the query-tree for these purposes. Section 4.1 describes two uses of the
query-tree for speeding up inference. In the first, the query-tree is used to
decide which ground formulas are strongly irrelevant to the query. Based on that
determination, we create specialized database indices that see only the ground
formulas that are (possibly) relevant to a class of queries. Using these indices
for fetching ground formulas significantly speeds up inference. The second use
of the query-tree is based on the observation that the tree also encodes all the
possible sequences of rule applications and database lookups that can result in
derivations of the query. We can therefore use the query-tree to guide the
search of a backward chainer to follow only these sequences. We present and
analyze experimental results which show that both these uses yield significant
savings in practice.

Section 4.2 considers the problem of deriving logical conclusions from
irrelevance claims that are given to the system by an external source. It
describes an algorithm based on the query-tree for deriving such conclusions. It
also describes an algorithm that uses the query-tree to derive logical
conclusions from relevance-claims, i.e., claims that state that certain formulas
are necessarily used in derivations of the query. Finally, Section 4.3 describes
how the query-tree can be used to extend other query evaluation methods.


4.1 Using The Query- Tree to Speed Up Inference 

Tlie first use of the query-tree is based on the.observat ion. (Corollary 3.7) that it tells 
us exactly which formulas may be relevant-to a query (or set of queries). Specifically, 
a rule is strongly irrelevant to the query if and only if it does not appear in the tree. A 
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gfound formula is strongly irrelevant if and only if ir does not satisfy the constraint- 
label of any goal-node in the tree with which it can be unified. Consequently, when 
answering the query, these rules and ground formulas can be ignored. For instance, in 
the good.PQ.th example (repeated in Figure 4.1). the rule 7*5 can be ignored. Similarlv. 
formulas of the relation Step that do not satisfy {100 < X < Y < 170} can also be 
ignored. 

We can use this property of the query-tree to speed up inference for sets of queries
that occur frequently. Given such a set of queries, we build a query-tree for it and
create specialized indices only on the formulas that are not strongly irrelevant to
queries in the set. The cost of preprocessing the knowledge base in such a way involves
the cost of building the query-tree and the cost of one pass over the knowledge base
to build the specialized indices. However, the payoff of removing irrelevant formulas
can be significant because the size of the space that an inference mechanism needs
to search can be drastically reduced. Specifically, it is guaranteed that every time
a ground formula is retrieved, the formula may be part of a derivation of the query
since it satisfies the constraint label of some node in the query-tree. This is especially
significant when lookups are made with some unbound variables. For instance, in our
example, there will be many lookups of the form step(a, Y), where a is some constant
and Y is unbound. Using the specialized index on the formulas of the predicate step
guarantees that every formula retrieved will satisfy {100 < Y < 170}. In contrast,
retrieving a formula that does not satisfy this constraint can generate a whole search
subtree that is guaranteed to be useless. Note that even if the reasoning mechanism
detects immediately (by checking the available constraints) that the retrieved formula
is irrelevant, the cost of doing all the useless lookups and checking the constraints
can be arbitrarily large.¹
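
The preprocessing pass described above can be sketched as follows. This is a minimal Python illustration, not the indexing scheme used in the experiments: the function names, the choice of indexing on the first argument, and the sample step facts are all assumptions.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def build_specialized_index(
    facts: Iterable[Tuple[int, int]],
    label: Callable[[int, int], bool],
) -> Dict[int, List[int]]:
    """One pass over the facts of a binary relation, keeping only those
    that satisfy the constraint label, indexed on the first argument."""
    index: Dict[int, List[int]] = defaultdict(list)
    for x, y in facts:
        if label(x, y):
            index[x].append(y)
    return index


def lookup(index: Dict[int, List[int]], a: int) -> List[int]:
    """Answer a lookup of the form step(a, Y) with Y unbound."""
    return index.get(a, [])


# Invented step facts; the label is the one quoted in the text.
step_facts = [(50, 60), (110, 120), (110, 300), (120, 160), (165, 180)]
step_index = build_specialized_index(step_facts, lambda x, y: 100 < x < y < 170)
```

With this index, `lookup(step_index, 110)` returns only `[120]`; the fact `step(110, 300)` is never retrieved, so no search subtree is started from it.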

The second use of the query-tree is based on the observation that the tree also
encodes the sequences of rule applications and database lookups that can result in
derivations of the query. We can use this observation to further control our search.
To illustrate, consider the following example.

Example 4.1: Consider a knowledge base defining a relation dessertMeal with the
following rules. Its query-tree is shown in Figure 4.2.

r_1: cheapMeal(D_1, W_1) ∧ meat(D_1) ∧ expensiveMeal(D_2, W_2) ∧ dessert(D_2) ⇒
     dessertMeal(D_1, W_1, D_2, W_2)
r_2: dish(X, Z) ∧ (Z < 15) ∧ compatible(X, Y) ⇒ cheapMeal(X, Y)
r_3: dish(X, Z) ∧ (Z > 15) ∧ compatible(X, Y) ⇒ expensiveMeal(X, Y)
r_4: beef(X) ∧ redWine(Y) ⇒ compatible(X, Y)
r_5: dessert(X) ∧ sweetWine(Y) ⇒ compatible(X, Y)

¹ Note that in order to detect irrelevant formulas immediately, the reasoning mechanism must
propagate the constraints in the same fashion as is done in creating the query-tree.




The knowledge base Δ consists of the following rules:

r_1: badPoint(X) ∧ path(X, Y) ∧ goodPoint(Y) ⇒ goodPath(X, Y).
r_2: link(X, Y) ⇒ path(X, Y).
r_3: link(X, Z) ∧ path(Z, Y) ⇒ path(X, Y).
r_4: step(X, Y) ⇒ link(X, Y).
r_5: bigStep(X, Y) ⇒ link(X, Y).

The following constraints are given on the ground facts:

badPoint(X) ⇒ 100 < X < 200.
step(X, Y) ⇒ X < Y.
goodPoint(X) ⇒ 150 < X < 170.
bigStep(X, Y) ⇒ X < 100 ∧ Y > 200.

goodPath(X, Y) {100 < X < Y < 170, Y > 150}
  r_1
    badPoint(X) {100 < X < 170}
    path(X, Y) {100 < X < Y < 170, Y > 150}
      r_2
        link(X, Y) {100 < X < Y < 170, Y > 150}
          r_4
            step(X, Y) {100 < X < Y < 170, Y > 150}
      r_3
        link(X, Z) {100 < X < Z < 170}
          r_4
            step(X, Z) {100 < X < Z < 170}
        path(Z, Y) {100 < Z < Y < 170, Y > 150}
    goodPoint(Y) {150 < Y < 170}

Figure 4.1: The query-tree for goodPath.


The predicates meat, beef and dessert are sort predicates (dessert is disjoint from
the other two). The relation compatible represents pairs consisting of a wine and a
dish that are compatible with each other. The relation dish represents the available
dishes and their prices. Consider formulas of the relation dish. Any formula that
satisfies either (beef(D_1) ∧ Z < 15) or (dessert(D_2) ∧ Z > 15) may be relevant to the
query dessertMeal. However, as a subgoal of r_2, we need only consider formulas of
dish that satisfy the first constraint, whereas as a subgoal of r_3, only formulas that
satisfy the second constraint are needed. Moreover, the query-tree shows that rule r_4
can only be applied to a subgoal of r_2 and not of r_3 (and vice versa for r_5). ∎

To exploit this additional control knowledge, we create specialized indices for every
leaf in the query-tree and modify the inference mechanism to follow only the paths
permitted by the query-tree. In our example, we create one index for beef dishes
under $15 and another for dessert dishes over that price. To follow the query-tree
during inference, we attach to every subgoal n in our search a node in the query-tree
φ(n). We start by assigning the root of the query-tree to the query. At every step,
if n is a database lookup subgoal (i.e., a subgoal of an EDB predicate), we perform
the lookup using the specialized index of φ(n). Otherwise, we expand n only with
the rules that are children of the expanded equivalent of φ(n).² We assign to the
subgoals of n the appropriate subgoals of the rule-node in the query-tree. As a result,
the inference engine follows only the paths encoded by the query-tree and in every
database lookup it retrieves only ground formulas that can be used in derivations in
the current path.

dessertMeal(D_1, W_1, D_2, W_2) {beef(D_1), dessert(D_2)}
  r_1
    cheapMeal(D_1, W_1) {beef(D_1)}
      r_2
        dish(D_1, Z) {beef(D_1), Z < 15}
        compatible(D_1, W_1)
          r_4
            beef(D_1)
            redWine(W_1)
    expensiveMeal(D_2, W_2) {dessert(D_2)}
      r_3
        dish(D_2, Z_1) {dessert(D_2), Z_1 > 15}
        compatible(D_2, W_2)
          r_5
            dessert(D_2)
            sweetWine(W_2)

Figure 4.2: Avoiding search paths using the query-tree.
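
The procedure of attaching a query-tree node to every subgoal can be sketched in Python as below. This is a simplified illustration under several assumptions: the tree is the non-recursive one of Figure 4.2, so every goal-node is its own expanded equivalent; the sort literals of r_1 are taken to be folded into the labels and indices; and the class names, the `trace` list, and the sample dishes and wines are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple

Env = Dict[str, object]  # variable bindings along the current search path


@dataclass
class GoalNode:
    pred: str
    args: Tuple[str, ...]                                   # variable names used in the tree
    rules: List["RuleNode"] = field(default_factory=list)   # empty for EDB leaves
    index: List[tuple] = field(default_factory=list)        # specialized index of an EDB leaf


@dataclass
class RuleNode:
    name: str
    subgoals: List[GoalNode]


def solve(node: GoalNode, env: Env, trace: List[str]) -> Iterator[Env]:
    """Solve the subgoal whose attached query-tree node is `node`."""
    if not node.rules:
        # Database lookup: consult only the index attached to this leaf.
        for row in node.index:
            new = dict(env)
            if all(new.setdefault(v, c) == c for v, c in zip(node.args, row)):
                yield new
        return
    for rule in node.rules:
        # Expand only with the rule-nodes that are children of this goal-node.
        trace.append(rule.name)
        yield from solve_all(rule.subgoals, env, trace)


def solve_all(goals: List[GoalNode], env: Env, trace: List[str]) -> Iterator[Env]:
    if not goals:
        yield env
        return
    for e in solve(goals[0], env, trace):
        yield from solve_all(goals[1:], e, trace)


def leaf(pred: str, args: Tuple[str, ...], rows: List[tuple]) -> GoalNode:
    return GoalNode(pred, args, index=rows)


# The tree of Figure 4.2 with invented facts in the leaf indices.
compat_cheap = GoalNode("compatible", ("D1", "W1"), rules=[RuleNode("r4", [
    leaf("beef", ("D1",), [("steak",)]),
    leaf("redWine", ("W1",), [("merlot",)])])])
compat_exp = GoalNode("compatible", ("D2", "W2"), rules=[RuleNode("r5", [
    leaf("dessert", ("D2",), [("souffle",)]),
    leaf("sweetWine", ("W2",), [("sauternes",)])])])
cheap = GoalNode("cheapMeal", ("D1", "W1"), rules=[RuleNode("r2", [
    leaf("dish", ("D1", "Z"), [("steak", 12)]), compat_cheap])])
expensive = GoalNode("expensiveMeal", ("D2", "W2"), rules=[RuleNode("r3", [
    leaf("dish", ("D2", "Z1"), [("souffle", 20)]), compat_exp])])
root = GoalNode("dessertMeal", ("D1", "W1", "D2", "W2"),
                rules=[RuleNode("r1", [cheap, expensive])])

trace: List[str] = []
answers = list(solve(root, {}, trace))
```

Because each compatible subgoal carries its own tree node, the sketch never tries r_5 under cheapMeal or r_4 under expensiveMeal, which is the saving the example describes.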

4.1.1 Experimental Results 

The impact of the savings achieved by using the query-tree was tested using a depth
first search backward chainer on Horn rules.³ Given a knowledge base Δ and a query
schema q (i.e., a query with free variables), we built a query-tree for q and two sets
of indices on ground formulas in Δ. The first set I_1 included an index on every
relation that includes only the formulas that were deemed not strongly irrelevant by
the query-tree. Specifically, a ground formula e(a_1, ..., a_n) is included in the index
for the relation e in I_1 if a_1, ..., a_n satisfies the constraint label of some leaf of the
predicate e in the query-tree.⁴ The second set of indices I_2 included one index for
every EDB leaf in the query-tree. We measured three running times:

• BC1 - the backward chainer on Δ using the original indices in the KB.

• BC2 - the backward chainer on Δ using the indices I_1, i.e., ignoring strongly
irrelevant formulas.

• BC3 - a backward chainer that uses the indices I_2 and only follows the paths
allowed by the query-tree.

² Which may be the node φ(n) itself.

³ The performance of the backward chainer compared favorably with that of Epikit (a commercial
implementation of MRS (Russell, 1985)). Furthermore, the speedups attained by removing irrelevant
formulas (BC2 below) were also tested using the backward chainer of Epikit and the speedups
attained were even better than those reported here. In the experiments we tested several rule and
goal orderings. The results are shown for the ordering that yielded the best results consistently for
all three versions of the backward chainer.

We tested over 20 query schemas taken from the following four domains:

1. A travel domain using a database of real airline data describing flights between
cities in the U.S. (examples 3-6 in the tables).

2. A wine domain consisting of a knowledge base of 50 rules describing various
wines and dishes and compatibilities between them (based in part on [Rombauer
and Rombauer-Becker, 1975]) (examples 7-8).

3. A student-advisor domain using a knowledge base about computer science Ph.D.
graduates, including advisor, school and graduation dates (examples 9-10).

4. The goodPath example, using the rules in Example 4.1 (examples 1-2).

The first and fourth domains usually yield deep recursive search trees, even though
the number of rules is small. The second domain is non-recursive and yields shallow
but bushy (i.e., large branching factor) search trees. In the third domain, search trees
have a low branching factor (which was from student to advisor).

Table 4.1 presents the results of the experiments for the case where we are looking
for all solutions to a query (e.g., find all X, Y such that goodPath(X, Y) is derivable).
In the table, Filtering Time includes the time taken to build the query-tree and
create all the indices (both I_1 and I_2). Percent Irrelevant is the percent of ground
formulas in the knowledge base that were deemed strongly irrelevant (and therefore
not included in I_1). The next columns show the time taken to find all the solutions
to the query. The respective running times of BC1, BC2 and BC3 are shown, as well
as the ratios of running times. In addition to measuring running times, the number
of nodes expanded in the search was also counted. The last two columns show the
ratios of the number of nodes expanded by BC2 and BC3 compared to BC1.

The results show significant speedups for both BC2 and BC3. For BC2, the speedups
were usually in excess of a factor of 3, ranging up to 31 (mean: 10.4). The results
show that by following the query-tree using BC3 we often get additional improvements.
The speedups of BC3 over BC1 were usually in excess of 5, ranging up to 1190 (mean:
41, excluding example #6).⁵ In terms of nodes expanded, the average speedup for
BC2 was 10, while the average speedup for BC3 was 37 (excluding example #6). The
results clearly show that if we are looking for all solutions to the query, building the
query-tree and the specialized indices will yield significant savings.

⁴ Note that the original knowledge base had an index for each ground relation.

Table 4.1: Experimental results: finding all solutions.


Table 4.2 presents the results of the experiments for the case in which we use the
query-tree built for a query schema to solve ground queries or to find the first solution
to a query with free variables (i.e., the query-tree was built for goodPath(X, Y) and
the query is goodPath(130, 160), or we are trying to find the first binding for X
and Y). The second and third columns show the ratios of the number of nodes
expanded for ground queries. The next columns show the node ratios of finding the
first solution to the query. The next column compares the preprocessing time and
the time to find solutions to the query. It shows the number of calls (each looking
for the next solution) after which the preprocessing time equals the time to answer
the queries. The last column shows the number of solutions found for the query. The
results indicate that often the preprocessing pays off after a very small number of
solutions and therefore it is beneficial to build a query-tree even in cases when we are
searching for few solutions.

Table 4.2: Experimental results: ground queries and finding the first solution.


4.1.2 Analysis

The experiments showed that the savings achieved by using the query-tree are affected
by several factors. In this section we describe these effects.

⁵ Example #6 was excluded from the mean because the speedups it yielded were exceptionally
high.


Percent of Irrelevant Formulas

The analysis of the algorithm suggests that the speedups obtained will be significantly
affected by the percent of formulas in the knowledge base that are found to be
irrelevant to the query. To test this effect, we ran several variants of each example that
differed only in the constants appearing in the rules (which had the effect of varying
the percent of irrelevant formulas). The results, shown in Table 4.3, show that the
speedups grow significantly as the percent of irrelevant formulas increases. For
example, when 90% of the facts are found to be strongly irrelevant, we get speedups
greater than a factor of 100.

It is important to note that we have the flexibility of building a query-tree at different
levels of generality and thereby to achieve varying percents of irrelevant formulas.
For example, instead of building a query-tree for the query schema goodPath(X, Y),
we can build one for goodPath(120, Y). Doing so will result in deeming additional
formulas irrelevant (e.g., step(X, Y) for 100 < X < 120 in this case). However, the
indices created by this query-tree will be usable for a smaller set of queries.
Consequently, in using the query-tree one should attempt to identify the most accurate
characterizations of frequently occurring sets of queries.

Table 4.3: Changing the percent of irrelevant formulas.


The Number of Ground Formulas in the Knowledge Base

The second factor affecting the speedups, suggested by the initial results,
is the number of ground formulas in the original knowledge base. To test this effect,
we ran each of the examples with databases containing a different number of ground
formulas. The results, shown in Table 4.4, show that the speedups increased as the
size of the databases grew, even if the percent of irrelevant formulas remained roughly
the same. The growth can be explained by the fact that the cost of backward chaining
is more than linear in the number of formulas. Therefore, the effect of removing some
constant percent of formulas will be greater when the overall number of formulas is
greater. These results are significant in that they suggest that our methods will scale
up to large knowledge bases and be even more effective there (recall that the cost of
building the query-tree is independent of the number of ground formulas).

Table 4.4: Changing the size of the database.


Placement of Interpreted Literals in the Rules 

A final factor in the speedups achieved from using the query-tree is the way the
interpreted literals are placed in the rules. To illustrate, consider the following set of
rules defining the existence of a flight (perhaps with stops) between two cities in the
country subject to time constraints (given by the constants s_0 and e_0):

u_1: p(X, Y, S_1, E_1) ∧ (s_0 < S_1) ∧ (e_0 > E_1) ⇒ timelyConnect(X, Y)
u_2: fl(X, Y, S, E) ⇒ p(X, Y, S, E)
u_3: fl(X, Z, S, T) ∧ (T < T_1) ∧ p(Z, Y, T_1, E) ⇒ p(X, Y, S, E)

To describe such paths, the rules can also be written as follows:

v_1: p(X, Y, S_1, E_1) ⇒ timelyConnect(X, Y)
v_2: fl(X, Y, S, E) ∧ (S > s_0) ∧ (E < e_0) ⇒ p(X, Y, S, E)
v_3: fl(X, Z, S, T) ∧ (T < T_1) ∧ (S > s_0) ∧ (T < e_0) ∧ p(Z, Y, T_1, E) ⇒ p(X, Y, S, E)

The difference between the two sets of rules is that the second set is crafted
to exploit the constraints entailed by the interpreted constraints on timelyConnect.
Specifically, whenever we retrieve a flight formula that violates the constraints (i.e.,
ends later than e_0 or begins before s_0), we will immediately backtrack. In contrast,
when using the first set of rules, we will compute all possible paths (in a bottom up
computation) and check the constraints in the last step of the derivation.
Consequently, when using the first set of rules, a strongly irrelevant formula may be the
root of an arbitrarily large tree, whereas when using the second set, no such tree
will be generated. Removing strongly irrelevant formulas will therefore have a
greater effect for a set of rules like the first one. The experimental results confirm
this observation. The example pairs 1 & 2 and 3 & 4 are instances of rules differing
exactly in this fashion.

Several points should be noted with respect to this issue: 

• Although the speedups are significantly bigger using the first set of rules in each
pair, we still achieve significant savings even when the rules are carefully crafted
such that the constraints are used to control the search.

• Writing rules with such built-in control has many disadvantages ([Clancey,
1983]). It is extremely difficult to write such rules in practice and is a very
error-prone task. Consequently, we expect rules would usually be written
without such crafting.

• Crafting a set of rules with such built-in control can however be done easily
using the query-tree (as we did in Section 3.4). Specifically, we can create a
new rule for every rule-node in the query-tree that includes the constraints of
that node. The resulting set of rules will be equivalent to the original set with
respect to the query predicate (i.e., will produce the same answer regardless
of the database of ground facts). However, using the new set, the tightest
constraints will be enforced on the bindings immediately when they appear.


Applicability to Other Inference Mechanisms 

The experiments described above were done with a depth first search backward
chaining inference mechanism. However, the techniques we described can be applied to a
wide range of reasoning mechanisms. The first use of the tree, the removal of strongly
irrelevant formulas, is independent of the reasoning scheme used. Following the
query-tree can also be integrated easily into any reasoning mechanism. The only requirement
is that nodes in the search space be associated with nodes in the query-tree; the
particular order in which the space is searched is unimportant.

Finally, there are several possible schemes for integrating the construction of the
query-tree and the indices with search for the solution. One possibility is to create
a specialized index for a relation only if it is actually referenced in the search. This
way one can avoid creating indices that will not be used.


4.2 Irrelevance Claims from an External Source

Until now we have used the query-tree to decide automatically which formulas are
irrelevant to a given query. Often a user may be able to supply the system with
additional irrelevance claims based on his/her knowledge about the domain and about
the ground formulas in the KB (or those that may appear in the KB). Specifically,
the user may know that a set of formulas Φ is strongly irrelevant to the query q, given
the possible ground formulas that may occur in the knowledge base. This knowledge
may not be expressible as explicit constraints on ground formulas, which can be used
directly by the query-tree. For example, this knowledge may be based on the fact
that the join of two relations is empty, which is not expressible using Horn rules.
Alternatively, this knowledge may be heuristic in nature.

Clearly, if we are told that a formula φ is strongly irrelevant to q, we can ignore
φ when answering q. However, we may also be able to conclude that other formulas
are irrelevant as well. This section describes an algorithm for deriving such
conclusions using the query-tree. In Section 4.2.1, we consider a different kind of external
knowledge in which the system is told that some formulas are necessarily relevant to
the query.

Formally, the problem we consider here is as follows. Suppose that P is a set of
rules and let I be an irrelevance claim stating that a set of formulas Φ is strongly
irrelevant to a query q. More precisely, I actually states that the set of possible KBs
is some subset Σ' such that SI(Φ, q, Σ', DI_2, D_q) holds.⁶ We assume that Φ
is composed of a set of rules Φ_r ⊆ P and a set of ground formulas Φ_g specified as a
set {p(X) | C(X)}, where p is some IDB predicate⁷ and C(X) is a formula with only
interpreted predicates.⁸ Our goal is to find which strong irrelevance claims follow
from I, i.e., for which formulas φ_1 the following holds:

SI(Φ, q, Σ', DI_2, D_q) ⇒ SI(φ_1, q, Σ', DI_2, D_q).

To derive logical conclusions from I using the query-tree, our strategy is to create
a set of rules P_1 such that, when formulas from Φ are excluded, P_1 and P produce
the same derivations of the query q for every set of ground facts G. We then create a
query-tree for P_1 and find all formulas that are strongly irrelevant to q. If a formula φ_1
is found to be strongly irrelevant with respect to P_1, then it is also strongly irrelevant
with respect to P whenever I holds.

⁶ Note that there may be many such subsets Σ'. In our algorithm and analysis we will assume
that Σ' is the maximal such subset, but our conclusions will hold for any such subset.

⁷ Note that the case of EDB predicates can be handled in a straightforward fashion by the
query-tree algorithm.

⁸ Φ can include a collection of such sets. However, for simplicity of exposition we assume that
there is only one.

Formally, given that Φ = Φ_r ∪ {p(X) | C(X)}, the set of rules P_1 is defined as
follows:

1. If r ∈ P and r ∉ Φ_r and the head of r is not p, then r ∈ P_1.

2. If r ∈ P and r is of the form q_1(X_1) ∧ ... ∧ q_l(X_l) ⇒ p(X), r ∉ Φ_r, and

   ¬C(X) = d_1(X) ∨ ... ∨ d_m(X),

   where each d_i is a conjunction of literals of interpreted predicates, then P_1
   includes rules of the form

   q_1(X_1) ∧ ... ∧ q_l(X_l) ∧ d_i(X) ⇒ p(X)

   for 1 ≤ i ≤ m. For future reference we denote these rules by P_1(r).

Example 4.2: Suppose that we are told that the set Φ = {path(X, Y) | X < 120}
is strongly irrelevant to the query goodPath in Figure 4.1. The rules in P_1 would
include rules r_1, r_4 and r_5, as well as the following rules for the predicate path:

link(X, Y) ∧ (X ≥ 120) ⇒ path(X, Y).
link(X, Z) ∧ (X ≥ 120) ∧ path(Z, Y) ⇒ path(X, Y).

The query-tree for P_1 will show that the formulas {badPoint(X) | X < 120} are
strongly irrelevant to goodPath(X, Y). ∎
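
The construction of P_1 can be sketched in Python as follows, using the rules of Figure 4.1 and the claim of Example 4.2. This is only an illustration under stated assumptions: rules are represented as strings, the disjuncts d_i of ¬C(X) are supplied by hand rather than computed, the head variables of every rule for p are assumed to be named as in C(X), and the names `Rule` and `build_p1` are hypothetical.

```python
from typing import List, NamedTuple, Set


class Rule(NamedTuple):
    name: str
    head: str          # e.g. "path(X, Y)"
    body: List[str]    # body literals, in order


def head_predicate(rule: Rule) -> str:
    return rule.head.split("(", 1)[0]


def build_p1(rules: List[Rule], phi_r: Set[str], p: str,
             neg_c_disjuncts: List[str]) -> List[Rule]:
    """Build P_1 from P, the rule names in Phi_r, the predicate p, and the
    disjuncts d_1, ..., d_m of the negation of C(X)."""
    p1: List[Rule] = []
    for r in rules:
        if r.name in phi_r:
            continue                      # rules claimed irrelevant are dropped
        if head_predicate(r) != p:
            p1.append(r)                  # case 1: head is not p, keep as is
        else:
            # case 2: one specialized copy of r per disjunct d_i
            for i, d in enumerate(neg_c_disjuncts, start=1):
                p1.append(Rule(f"{r.name}.{i}", r.head, r.body + [d]))
    return p1


P = [
    Rule("r1", "goodPath(X, Y)", ["badPoint(X)", "path(X, Y)", "goodPoint(Y)"]),
    Rule("r2", "path(X, Y)", ["link(X, Y)"]),
    Rule("r3", "path(X, Y)", ["link(X, Z)", "path(Z, Y)"]),
    Rule("r4", "link(X, Y)", ["step(X, Y)"]),
    Rule("r5", "link(X, Y)", ["bigStep(X, Y)"]),
]

# Phi = {path(X, Y) | X < 120}; the negation of C has the single disjunct X >= 120.
P1 = build_p1(P, phi_r=set(), p="path", neg_c_disjuncts=["X >= 120"])
```

The sketch appends each d_i at the end of the body, whereas the example writes it after the first literal; the order of conjuncts does not affect the meaning of the rule.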

To prove the correctness of our algorithm, we show that P_1 produces precisely the
same derivations as those produced by P, except for derivations including formulas
in Φ.

Lemma 4.3: Let P be a set of Horn rules and let P_1 be the set of rules produced
by our algorithm, given that the formulas Φ are irrelevant to the query. Let D be an
arbitrary set of ground facts.

A derivation d that does not use formulas from Φ is a valid derivation of the query
from D ∪ P if and only if there is a valid derivation d' of the query from D ∪ P_1, such
that the only difference between d and d' is that every instance of a rule r of p used
in d is replaced in d' by an instance of a rule in P_1(r).

Proof: Let d' be a derivation of the query from D ∪ P_1. Clearly, if we replaced each
of the rules of p used in d' by its original rule in P, the resulting derivation would be
a valid derivation of the query from D ∪ P because the original rule does not contain the
additional literal of the interpreted predicate (d_i) in the antecedent.

Conversely, let d be a derivation of the query from D ∪ P. Since d does not use
formulas from Φ, it does not include rules from Φ_r. To complete the proof, suppose
the formula p(a) is used as a part of the derivation d and was derived using the
instantiated rule q_1(a_1) ∧ ... ∧ q_l(a_l) ⇒ p(a). Since p(a) ∉ Φ_g, there must be some j
such that a satisfies d_j. Therefore, p(a) can also be derived using the instantiated
rule q_1(a_1) ∧ ... ∧ q_l(a_l) ∧ d_j(a) ⇒ p(a), which is an instance of a rule in P_1.

Consequently, every rule of p in d can be replaced by a rule of P_1, and so we can
construct a derivation for the query from the rules of P_1. ∎

The following corollary shows that inferring strong irrelevance claims from the
query-tree of P_1 is sound:

Corollary 4.4: Let P_1 be the set of rules constructed from P and the irrelevance
claim I stating that Φ is strongly irrelevant, i.e., I = SI(Φ, q, Σ', DI_2, D_q). If φ_1 is a
ground atomic formula and SI(φ_1, q, Σ_{P_1}, DI_2, D_q) holds, then

I ⇒ SI(φ_1, q, Σ', DI_2, D_q)

holds. If φ_1 is a rule, and for all r' ∈ P_1(φ_1), SI(r', q, Σ_{P_1}, DI_2, D_q) holds, then

I ⇒ SI(φ_1, q, Σ', DI_2, D_q)

holds.


Proof: By Lemma 4.3, a formula φ_1 can be used in a derivation of the query q from
P and some database D if and only if either:

1. φ_1 is used in a derivation of q from P ∪ D that includes some formula from Φ,
or

2. φ_1 is a ground atomic formula and can be used in a derivation of q from P_1 ∪ D,
or

3. φ_1 is a rule and there is some r' ∈ P_1(φ_1) that can be used in a derivation of q
from P_1 ∪ D.

Consequently, if Σ' is a set of databases in which the formulas Φ are strongly irrelevant
to q, then the first possibility is ruled out. In the case of φ_1 being a ground atomic
formula, since SI(φ_1, q, Σ_{P_1}, DI_2, D_q) holds, the second possibility is ruled out, and if
φ_1 is a rule, then because none of the rules in P_1(φ_1) are in the query-tree, the third
possibility is ruled out. Consequently, the corollary holds. ∎

The same algorithm can be used to derive logical conclusions from external weak
irrelevance claims.

Corollary 4.5: Let P_1 be the set of rules constructed from P and the set of irrelevant
formulas Φ. Suppose I_w = WI(Φ, q, Σ', DI_2, D_q). If φ_1 is a ground atomic formula
and SI(φ_1, q, Σ_{P_1}, DI_2, D_q) holds, then

I_w ⇒ WI(φ_1, q, Σ', DI_2, D_q)

holds. If φ_1 is a rule, and for all r' ∈ P_1(φ_1), SI(r', q, Σ_{P_1}, DI_2, D_q) holds, then

I_w ⇒ WI(φ_1, q, Σ', DI_2, D_q)

holds.

Proof: Suppose I_w holds. This implies that for any database D ∈ Σ', any answer
to the query has a derivation d that does not use formulas in Φ. Therefore, by
Lemma 4.3, there will be a derivation d' of the query from P_1 ∪ D corresponding to
d. If φ_1 is a ground formula, then since SI(φ_1, q, Σ_{P_1}, DI_2, D_q) holds, d' does not
use φ_1, and therefore d does not use φ_1 (because d' contains a superset of the ground
formulas in d). Similarly, if φ_1 is a rule, then since SI(r', q, Σ_{P_1}, DI_2, D_q) holds for
every rule r' in P_1(φ_1), d does not use φ_1. ∎

Our inference procedure is not complete. In fact, in general, it is not possible to
find all the consequences of an irrelevance claim I, even if Φ includes a single rule.

Theorem 4.6: Suppose that P is a set of function-free Horn rules with no interpreted
predicates. Let I be the irrelevance claim stating that a rule r ∈ P is strongly irrelevant
to the query q, i.e., the set of possible KBs is Σ', where Σ' is the maximal subset of
Σ_P such that SI(r, q, Σ', DI_2, D_q) holds. There is no algorithm that will determine
whether SI(φ_1, q, Σ', DI_2, D_q) holds for an arbitrary formula φ_1.

The proof is based on a reduction from the rule redundancy problem, and its 
details are given in Appendix A. 

Example 4.7: As an example of a knowledge base for which our algorithm will not
find all the conclusions of I, consider the following rules:

s_1: e(X) ⇒ p_1(X)
s_2: e(X) ⇒ p_2(X)
s_3: p_1(X) ⇒ p(X)
s_4: p_2(X) ⇒ p(X)

Suppose I states that rule s_1 is strongly irrelevant to p. Our algorithm will build a
query-tree that will consist of the rules s_2 and s_4 and will deem s_3 to be strongly
irrelevant. However, since the derivations using s_1 are isomorphic to the derivations
using s_2, strong irrelevance of s_1 implies strong irrelevance of s_2 and s_4 as well. ∎

As can be seen in this example, we can combine our algorithm with algorithms for
deriving weak irrelevance claims (see Chapter 5), and the resulting algorithm will be
able to derive additional irrelevance claims. Furthermore, using the results described
in Sections 3.3 and 3.4, we can also use the query-tree to derive additional strong
irrelevance claims by looking only at minimal derivations and in knowledge bases
that include some forms of negation in their antecedents.

4.2.1 Relevance Claims 

A different kind of knowledge that a user may be able to provide a system is positive
relevance knowledge. For example, the user may know that a certain formula must
be used in every possible derivation of the query. Such knowledge may be available
in several contexts. For example (as we often see in textbooks), we may be given
a hint that a certain lemma must be used in a proof of a theorem. As another
example, suppose a new formula is added to a knowledge base, and we want to find
the new derivable conclusions. In such a case, we know that the derivations of the new
conclusions must include the updated formula. As in the case of external irrelevance
claims, we may be able to use a relevance claim in order to deduce that some other
formulas are irrelevant to the query. In this section we show how to use the query-tree
to deduce such conclusions. In theory, it is possible to construct a space of definitions
for relevance analogous to the space we constructed for irrelevance. However, here we
consider only one such definition:

Definition 4.8: A formula φ is relevant to a query q with respect to a set of knowledge
bases Σ, denoted Relevant(φ, q, Σ), if φ appears in every derivation of q from each of
the KBs in Σ. ∎

To derive irrelevance-claims that are logical consequences of a given relevance-claim
we rely on the following observation. If a formula φ_1 cannot appear in any
derivation that includes φ, and φ is known to be relevant to the query q, then φ_1
must be strongly irrelevant to q. The query-tree enables us to find such relations
between formulas in the knowledge base, formalized by the following exclusiveness
condition:

Definition 4.9: Two formulas φ_1 and φ_2 are said to be exclusive with respect to a
set of rules P if there is no set of ground formulas G such that there is a derivation
of an answer to the query from P ∪ G that uses both φ_1 and φ_2. ∎

φ_1 and φ_2 being exclusive is a sufficient condition for deriving strong irrelevance.
The following lemma follows from the definitions:

Lemma 4.10: If Σ' is a set of databases such that Relevant(φ_1, q, Σ') holds, and φ_1
and φ_2 are exclusive, then SI(φ_2, q, Σ', DI_2, D_q) holds.

Proof: If φ_1 appears in every derivation of q from databases in Σ', and φ_1 and φ_2
cannot appear in the same derivation, then φ_2 does not appear in any derivation of q
from databases in Σ'. ∎

The exclusiveness condition can be determined using the query-tree. Figure 4.3
describes an algorithm that finds all the formulas Φ such that Φ and r are exclusive
with respect to P, where r is a rule.⁹ Informally, the algorithm begins from every
appearance, r_0, of r in the query-tree and marks all the nodes that can appear in a
derivation together with r_0. It labels above any node that can appear above r_0 in a
derivation tree, and labels below any node that can appear in such a tree, but not
necessarily above r_0. The correctness of the algorithm is established by the following
theorem.


procedure find-exclusive-formulas(T_0, r)
begin /* T_0 is the query-tree for the rules P. */
  for every appearance r_0 ∈ T_0 of r do:
    repeat
      label r_0 above and below.
      1: if a rule-node n is marked above, label its father goal-node above.
      2: if a goal-node n is marked above,
         then label its father above and its siblings below.
      3: if a goal-node n is labeled above, label above any of its unexpanded equivalents.
      4: if n is marked below, label its children below.
      5: if a goal-node n is marked below and m is its expanded equivalent, label m below.
    until no new nodes are marked.
    Any node that has an instance marked above or below is marked non-exclusive.
    Remove all above and below markings.
  end for
end.


Figure 4.3: Algorithm for finding exclusive rules.
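
The marking procedure of Figure 4.3 can be sketched in Python as follows. The `Node` class, its field names, and the function `non_exclusive_nodes` are assumptions made for illustration; the sketch handles a single appearance r_0 and returns the nodes marked above or below, and it is tried on a tree built from the rules of Example 4.7.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(eq=False)  # identity-based hashing, so nodes can be stored in sets
class Node:
    name: str
    kind: str                                   # "goal" or "rule"
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    expanded: Optional["Node"] = None           # set on unexpanded goal-nodes
    unexpanded: List["Node"] = field(default_factory=list)  # set on expanded goal-nodes

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def non_exclusive_nodes(r0: Node) -> Set[Node]:
    """Nodes marked above or below when the marking starts from r0."""
    above: Set[Node] = {r0}
    below: Set[Node] = {r0}
    changed = True

    def mark(marks: Set[Node], n: Optional[Node]) -> None:
        nonlocal changed
        if n is not None and n not in marks:
            marks.add(n)
            changed = True

    while changed:                               # until no new nodes are marked
        changed = False
        for n in list(above):
            mark(above, n.parent)                # steps 1 and 2: father above
            if n.kind == "goal":
                if n.parent is not None:
                    for sib in n.parent.children:
                        if sib is not n:
                            mark(below, sib)     # step 2: siblings below
                for u in n.unexpanded:
                    mark(above, u)               # step 3
        for n in list(below):
            for c in n.children:
                mark(below, c)                   # step 4
            if n.kind == "goal":
                mark(below, n.expanded)          # step 5
    return above | below


# Query-tree for the rules s_1, ..., s_4 of Example 4.7.
p = Node("p", "goal")
s3 = p.add(Node("s3", "rule"))
p1 = s3.add(Node("p1", "goal"))
s1 = p1.add(Node("s1", "rule"))
s1.add(Node("e@s1", "goal"))
s4 = p.add(Node("s4", "rule"))
p2 = s4.add(Node("p2", "goal"))
s2 = p2.add(Node("s2", "rule"))
s2.add(Node("e@s2", "goal"))

marked = {n.name for n in non_exclusive_nodes(s1)}
```

Starting from s_1, the nodes for s_2 and s_4 are never marked, so in this tree they are exclusive with respect to s_1.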


Theorem 4.11: Given a set of rules $\mathcal{P}$, a query $q$ and a rule $r \in \mathcal{P}$, procedure
find-exclusive-formulas will mark an instance of rule $s$ in the query-tree if and only if $s$
is not exclusive with respect to $r$. The rule $r$ and the atom $p(a_1,\ldots,a_n)$ are exclusive
if $a_1,\ldots,a_n$ does not satisfy the constraint label of any node of $p$ that was marked
non-exclusive.

⁹ The algorithm can be extended in a straightforward manner to the case where $r$ is a set of
formulas.

Proof: We prove the "only if" part by showing that if a node $n$ was marked
non-exclusive by the algorithm when the marking began with an appearance $r_0$ of $r$, then
there is a symbolic derivation tree $d$ encoded by the query-tree (by a mapping $\psi$)
such that there are two nodes $n_1$ and $n_2$ in $d$ such that $\psi(n_1) = n$ and $\psi(n_2) = r_0$.

We prove the claim by induction on the order of the marked goal-nodes. Specifically,
we show that for every node $n$ that is marked there is a partial symbolic
derivation tree $d'$,¹⁰ encoded by the query-tree using a mapping $\psi$ such that:

• If a node $n$ is marked above, then $n$ is $\psi(\mathrm{root}(d'))$ and $r$ appears in $d'$.

• If a node $n$ is marked below, then there is a leaf $n'$ in $d'$ such that $\psi(n') = n$,
and $d'$ includes $r$.

As a consequence of these claims, we can show that an appropriate symbolic derivation
tree $d$ exists. Given the partial tree $d'$, we consider a symbolic derivation tree encoded
by the query-tree in which one of the nodes $n$ is mapped to $\psi(\mathrm{root}(d'))$. We replace
the subtree of $n$ with $d'$, and complete the leaves of the resulting symbolic derivation
arbitrarily (note that Part 2 of Theorem 3.6 guarantees that the completion can be
done). The resulting symbolic derivation is encoded by the query-tree and satisfies
the requirements.

The claim holds trivially for the base case that includes the node $r_0$ and its father
and children, since (by Part 3 of Theorem 3.6) there is a symbolic derivation tree
encoded by the query-tree that includes $r_0$. The derivation tree $d'$ includes the
rule-node for $r$, its father and children. In the inductive case, there are several cases in
which a node $m$ could have been marked, corresponding to the conditionals in the
algorithm:

1. The node $m$ is a father rule-node of $n$ (case 2). By the inductive assumption,
$n = \psi(\mathrm{root}(d'))$ for some $d'$ that includes $r$. Consider a partial symbolic derivation
tree $d_1$ created by adding the rule in $m$ as the father of the root of $d'$. The
mapping of the nodes in $d'$ stays unchanged. The root of $d_1$ is mapped to the
father of $m$ (covering case 1 of the algorithm) and the top rule-node is mapped
to $m$. The siblings of $n$ are leaves in $d_1$ and they are mapped to the siblings of
$\psi(\mathrm{root}(d'))$.

2. The node $m$ is an expanded equivalent of a goal-node $n$ (case 3). Since there
exists a partial symbolic derivation $d'$ for which $\psi(\mathrm{root}(d')) = n$, we can just as
well make $\psi(\mathrm{root}(d')) = m$, and the encoding conditions of Definition 3.1 still
hold.

3. In case 4 of the algorithm, if $n$ is marked below, then by the inductive assumption
there is a leaf $n'$ in $d'$ such that $\psi(n') = n$. Therefore, if $r'$ is a child
rule-node of $n$, we can consider a partial symbolic derivation $d_1$ in which $n'$ is
expanded with the rule in $r'$. The mapping $\psi$ will map the child of $n'$ to $r'$ and
the children of that rule-node to the corresponding children of $r'$.

4. In case 5 of the algorithm, the same argument as in case 4 holds. Consider the
derivation $d'$ in which $n'$ is expanded with one of the children of its expanded
equivalent, $m$. Modify $\psi$ so that $\psi(n') = m$.

To complete the proof, we must show that if a node $n$ can appear with the rule $r_0$ in
a symbolic derivation, then it will be marked by the algorithm. Let $d$ be a symbolic
derivation encoded by the query-tree and $\psi$ be the mapping of the nodes of $d$ to the
nodes in the query-tree. Assume that there is some node $r' \in d$ such that $\psi(r') = r_0$.
We need to show that for every $n \in d$, $\psi(n)$ will be marked by the algorithm. Since
$r_0$ is marked below, all the nodes $n \in d$ that are below $r'$ will be marked below by a
combination of conditionals 4 and 5. Let $m$ be the father goal-node of $r'$. The node
$\psi(m)$ will be marked either by 1 or by 3. The father node of $\psi(m)$ will be marked
above by 2, and $\psi(m)$'s siblings will be marked below by 2. Consequently, if $r'_1$ is the
father of $m$, then for any node $n$ in its subtree, $\psi(n)$ will be marked by the algorithm.
We can continue in the same fashion for $r'_1$, and show that $\psi(n)$ will be marked for
every node $n$ that is a descendant of the top rule-node in $d$. Finally, $\psi(\mathrm{root}(d))$ will
be marked by 1. □

¹⁰ The derivation $d'$ is partial because its leaves are not necessarily EDB nodes and its root is
mapped to an arbitrary goal-node in the query-tree, not necessarily a root.

Example 4.12: Consider the following rules:

$r_1$: $wants(X,Y,C) \wedge canAfford(X,Y,C) \Rightarrow buys(X,Y,C)$

$r_2$: $wants(X,Y,C) \wedge canGetLoan(X,C_1) \wedge (C_1 > C) \Rightarrow buys(X,Y,C)$

$r_3$: $sees(X,Y,C) \wedge likes(X,Y) \Rightarrow wants(X,Y,C)$

$r_4$: $priceOf(Y,C) \wedge hasCash(X,C) \wedge (C < 100) \Rightarrow canAfford(X,Y,C)$

$r_5$: $customer(X,B) \wedge creditLimit(X,B,C) \Rightarrow canGetLoan(X,C)$

The atom $buys(X,Y,C)$ denotes that $X$ will buy item $Y$ at price $C$. The person $X$
will buy $Y$ only if she wants it. If she does, she will buy it if she has enough cash
at hand or if she can get a loan from a bank to cover the expense. The query-tree
constructed for $buys(X,Y,C)$ appears in Figure 4.4. In this example, rules $r_4$ and
$r_5$ are exclusive. Therefore, suppose we are given that $r_4$ must be used to answer
the query $buys(Sue,Y,Z)$ (because we know that Sue used cash for all her purchases
lately). The algorithm will mark the nodes in the query-tree as shown in Figure 4.4,
and therefore the rules $r_5$ and $r_2$ will be deemed strongly irrelevant to the query. □

buys(X,Y,C)
    {C_1 > C} r_2
        wants(X,Y,C)
            r_3
                sees(X,Y,C)
                likes(X,Y)
        canGetLoan(X,C_1)
            r_5
                customer(X,B)
                creditLimit(X,B,C_1)

{C < 100} buys(X,Y,C) {above}
    r_1 {above}
        {below} wants(X,Y,C)
            {below} r_3
                sees(X,Y,C) {below}
                likes(X,Y) {below}
        canAfford(X,Y,C) {above}
            r_4 {above, below}
                priceOf(Y,C) {below}
                hasCash(X,C) {below}

Figure 4.4: The query-tree of Example 4.12.
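
The marking procedure of Figure 4.3 can be sketched in Python on the rules of Example 4.12. This is a minimal illustration rather than the dissertation's implementation: the `Node` class, the function names, and the shape of the tree (a single root with both rule-nodes, no constraint labels, no unexpanded nodes) are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality/hash, so nodes can be put in sets
class Node:
    label: str
    kind: str  # "goal" or "rule"
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None
    expanded_equiv: Optional["Node"] = None  # set only on unexpanded goal-nodes

    def add(self, *kids: "Node") -> "Node":
        for k in kids:
            k.parent = self
            self.children.append(k)
        return self


def nodes_of(root):
    out, stack = [], [root]
    while stack:
        n = stack.pop()
        out.append(n)
        stack.extend(n.children)
    return out


def mark_from(r0, nodes):
    """Apply steps 1-5 of Figure 4.3 from one appearance r0; return marked nodes."""
    above, below = {r0}, {r0}
    changed = True

    def put(marks, n):
        nonlocal changed
        if n is not None and n not in marks:
            marks.add(n)
            changed = True

    while changed:
        changed = False
        for n in list(above):
            put(above, n.parent)                  # steps 1 and 2: father
            if n.kind == "goal":
                if n.parent is not None:          # step 2: siblings go below
                    for sib in n.parent.children:
                        if sib is not n:
                            put(below, sib)
                for m in nodes:                   # step 3: unexpanded equivalents
                    if m.expanded_equiv is n:
                        put(above, m)
        for n in list(below):
            for c in n.children:                  # step 4
                put(below, c)
            if n.kind == "goal":                  # step 5
                put(below, n.expanded_equiv)
    return above | below


def exclusive_rules(root, rule_label):
    """Rules with no instance marked when starting from any appearance of rule_label."""
    nodes = nodes_of(root)
    marked = set()
    for r0 in nodes:
        if r0.kind == "rule" and r0.label == rule_label:
            marked |= mark_from(r0, nodes)
    all_rules = {n.label for n in nodes if n.kind == "rule"}
    return all_rules - {n.label for n in marked if n.kind == "rule"}


def build_example():
    g = lambda s: Node(s, "goal")
    r = lambda s: Node(s, "rule")
    r3a = r("r3").add(g("sees(X,Y,C)"), g("likes(X,Y)"))
    r3b = r("r3").add(g("sees(X,Y,C)"), g("likes(X,Y)"))
    r4 = r("r4").add(g("priceOf(Y,C)"), g("hasCash(X,C)"))
    r5 = r("r5").add(g("customer(X,B)"), g("creditLimit(X,B,C1)"))
    r1 = r("r1").add(g("wants(X,Y,C)").add(r3a), g("canAfford(X,Y,C)").add(r4))
    r2 = r("r2").add(g("wants(X,Y,C)").add(r3b), g("canGetLoan(X,C1)").add(r5))
    return g("buys(X,Y,C)").add(r1, r2)


print(sorted(exclusive_rules(build_example(), "r4")))
```

Starting from $r_4$, the sketch leaves only $r_2$ and $r_5$ unmarked, which agrees with the conclusion of the example.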

4.3 Additional Uses of the Query-Tree 

In this section we briefly outline several ways in which the query-tree can be used to 
extend other query evaluation methods. 

Combining with Magic Set Transformation 

Two primary strategies for evaluating a query with a given set of rules are top-down
(e.g., backward chaining) and bottom-up (e.g., forward chaining). Top-down
techniques have the advantage that they are more goal-directed, since they exploit
the information in the query (e.g., the query predicate and bindings that appear in
the query). However, top-down techniques have the disadvantage that they may
result in infinite loops and that they require the operation of unification, instead of
the cheaper operation of term-matching, used in bottom-up evaluation. In contrast,
bottom-up techniques will not get into infinite loops (when the rules do not have
function symbols), but may compute many facts that are not relevant to the query.
The goal of the magic-set transformation method [Ullman, 1989] is to combine the
advantages of top-down and bottom-up evaluation methods. It transforms a given set
of rules $\mathcal{P}$ to a new set $\mathcal{P}_1$, such that $\mathcal{P}_1$ is equivalent to $\mathcal{P}$ with respect to the query
predicate. Moreover, a bottom-up evaluation of $\mathcal{P}_1$ provides the goal-directed
focus achieved by top-down evaluation. To illustrate, consider the following rules for
transitive closure with additional interpreted constraints:

$r_1$: $e(X,Y) \wedge (X < Y) \Rightarrow p(X,Y)$

$r_2$: $e(X,Z) \wedge (X < Z) \wedge p(Z,Y) \Rightarrow p(X,Y)$

and suppose our query is to find all $Y$ such that $p(a,Y)$ is derivable, where $a$ is some
constant. The transformed set of rules will be the following:

$s_1$: $m_p(a)$.

$s_2$: $m_p(X) \wedge e(X,Z) \wedge (X < Z) \Rightarrow m_p(Z)$

$s_3$: $m_p(X) \wedge e(X,Z) \wedge (X < Z) \wedge p(Z,Y) \Rightarrow p(X,Y)$

$s_4$: $m_p(X) \wedge e(X,Y) \wedge (X < Y) \Rightarrow p(X,Y)$

The predicate $m_p$ is the "magic predicate of $p$" and is used to constrain the tuples
that will be computed in the bottom-up evaluation of the rules. Essentially, $m_p$ is
the set of constants that may appear in the first argument of facts of $p$ that are used
in a derivation of $p(a,Y)$.
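
To make the effect of the magic predicate concrete, the following Python sketch evaluates both rule sets bottom-up on a small edge relation. It is only an illustration: the naive fixpoint loops, the function names, and the sample relation `e` are assumptions, not part of the transformation itself.

```python
def eval_original(e):
    """Naive bottom-up evaluation of r1 and r2; returns all derived p facts."""
    p = set()
    while True:
        new = {(x, y) for (x, y) in e if x < y}                       # r1
        new |= {(x, y) for (x, z) in e if x < z
                for (z2, y) in p if z2 == z}                          # r2
        if new <= p:
            return p
        p |= new


def eval_magic(e, a):
    """Naive bottom-up evaluation of s1-s4 for the query p(a, Y)."""
    m, p = {a}, set()                                                 # s1
    while True:
        new_m = {z for (x, z) in e if x in m and x < z}               # s2
        new_p = {(x, y) for (x, y) in e if x in m and x < y}          # s4
        new_p |= {(x, y) for (x, z) in e if x in m and x < z
                  for (z2, y) in p if z2 == z}                        # s3
        if new_m <= m and new_p <= p:
            return m, p
        m |= new_m
        p |= new_p


# Hypothetical edge relation: a chain 1-2-3-4, a separate chain 5-6-7,
# and one edge (4, 1) that violates the order constraint.
e = {(1, 2), (2, 3), (3, 4), (5, 6), (6, 7), (4, 1)}
p_all = eval_original(e)
m, p_magic = eval_magic(e, 1)
print(len(p_all), len(p_magic), sorted(m))
```

Both evaluations give the same answers for $p(1,Y)$, but the magic program never derives facts about the chain 5-6-7, since those constants never enter $m_p$.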

The limitation of the magic-set transformation is that it can only use binding 
information in the query. In the above example, it was able to use the fact that the first 
argument of p in the query is bound to a. However, it cannot use information about 
constraints on possible bindings. For example, if the query was $p(a,Y) \wedge (Y < 2)$,
the magic set transformation would not be able to exploit the constraint $Y < 2$.

The query-tree effectively pushes such constraints from the query to the other
rules of the knowledge base. Consequently, we can use it in order to extend the magic
set transformation with constraint pushing. Specifically, we can attach bf adornments
to labels of the nodes in the query-tree. The adornment specifies which arguments
of the node are free and which are bound. Adornments can be determined in a
top-down fashion as in the magic-set transformation algorithm. We can then refine the
equivalence relation on nodes in the tree by requiring that the adornments be the
same. The resulting rules will have the unique-binding property needed in order to
create a magic program (see [Ullman, 1989], Algorithm 13.1, pg. 828). Applying the
magic-sets rule transformation to this set of rules will yield the following rules for the
query $p(a,Y) \wedge (Y < 2)$:

$s'_1$: $m_p(a)$.

$s'_2$: $m_p(X) \wedge e(X,Z) \wedge (X < Z < 2) \Rightarrow m_p(Z)$

$s'_3$: $m_p(X) \wedge e(X,Z) \wedge (X < Z < 2) \wedge p(Z,Y) \Rightarrow p(X,Y)$

$s'_4$: $m_p(X) \wedge e(X,Y) \wedge (X < Y < 2) \Rightarrow p(X,Y)$

Note that with these rules, facts of the form $p(X,Y)$ with $Y > 2$ will not be produced
in a bottom-up computation. Since the rules created by the query-tree are equivalent
to the original rules with respect to the query, it follows from Theorem 13.1 in [Ullman,
1989] that our transformation is correct.
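
The difference that the pushed constraint makes can be sketched as follows. The function below is a hypothetical helper, not the dissertation's code: it evaluates the magic rules bottom-up with an optional upper bound on the second argument, so that an infinite bound corresponds to the plain magic rules and a bound of 2 to the rules listed above. The sample relation is made up.

```python
def magic_eval(e, a, bound=float("inf")):
    """Bottom-up evaluation of the magic rules with the constraint X < Z < bound."""
    m, p = {a}, set()                                  # seed: m_p(a)
    while True:
        # Edges usable by the rules: source is magic and the order constraint holds.
        ok = [(x, z) for (x, z) in e if x in m and x < z < bound]
        new_m = {z for (_, z) in ok}                   # magic rule
        new_p = set(ok)                                # base rule for p
        new_p |= {(x, y) for (x, z) in ok
                  for (z2, y) in p if z2 == z}         # recursive rule for p
        if new_m <= m and new_p <= p:
            return m, p
        m |= new_m
        p |= new_p


# Hypothetical edge relation with a = 0 and the query p(0, Y) with Y < 2.
e = {(0, 1), (1, 2), (2, 3), (0, 5)}
m_plain, p_plain = magic_eval(e, 0)
m_pushed, p_pushed = magic_eval(e, 0, bound=2)
print(len(p_plain), len(p_pushed))
```

On this input the plain magic rules derive seven facts for $p$ and filter them afterwards, whereas the constrained rules derive only the single fact that can contribute to an answer.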

It should be noted that using the query-tree to propagate the constraints has an
advantage over previous techniques, such as the use of bcf adornments [Mumick et al.,
1990]. That technique attaches a c adornment to an argument of a goal-node if there
is some known constraint on it. In contrast, the query-tree considers the semantics
of the interpreted literals to compute the actual constraint on the arguments.

Message-Passing Query Evaluation Schemes 

In a message passing scheme for query evaluation [Van Gelder, 1986], query evaluation
is viewed as a system of cooperating processes communicating by message passing.
Each process computes some set of tuples (essentially a subset of some relation).
The messages between the processes represent the needs of a process and
the solutions it generates. A need message is generated by a process that needs some
relation in order to compute its output relation. For example, a process computing the
relation $e_1(1,Y) \bowtie e_2(Y,Z)$ will send a message to a process computing the relation
$e_1$, specifying that it needs the subset of $e_1$ with the first argument bound to 1. After
computing the desired relation, the latter process will send a solution message, with
the possible bindings of $Y$.

The main advantage of a message passing scheme is that by breaking up the 
problem to such modules with well defined interfaces, we are able to exploit existing 
operating system features in order to facilitate and speed query evaluation. Such 
features include scheduling, message passing, and multi-tasking. This scheme is also
a first step towards parallel implementation of query evaluation. 

To facilitate such an evaluation scheme, we need to determine what processes will
exist and how they will communicate. To do so, Van Gelder uses a rule-goal tree
which resembles a simple version of the query-tree. Each goal-node in the tree is
considered to be a different process in the system. The termination condition that he
uses in constructing the rule-goal tree depends only on an isomorphism of the variable
patterns of the goal-nodes and of the adornments. Consequently, information about
interpreted literals is not propagated and therefore not used to determine the set of
processes.

The query-tree can be used directly to extend Van Gelder's scheme to incorporate
knowledge about interpreted literals. We can simply refine the labels of the nodes
in the query-tree with the adornments used in his construction. As a result, we will
be able to distinguish between parts of a relation that can be computed in parallel
and are independent of each other. For instance, consider the rules of Example 4.1
repeated below.

$r_1$: $cheapMeal(D_1,W_1) \wedge meat(D_1) \wedge expensiveMeal(D_2,W_2) \wedge dessert(D_2) \Rightarrow dessertMeal(D_1,W_1,D_2,W_2)$

$r_2$: $dish(X,Z) \wedge (Z < 15) \wedge compatible(X,Y) \Rightarrow cheapMeal(X,Y)$

$r_3$: $dish(X,Z) \wedge (Z > 15) \wedge compatible(X,Y) \Rightarrow expensiveMeal(X,Y)$

$r_4$: $beef(X) \wedge redWine(Y) \Rightarrow compatible(X,Y)$

$r_5$: $dessert(X) \wedge sweetWine(Y) \Rightarrow compatible(X,Y)$

Ordinarily, we would have one process for the predicate $dish$ that would send its
answers to processes of $cheapMeal$ and $expensiveMeal$. However, constructing a
message passing scheme based on the query-tree will result in two processes for $dish$,
one computing cheap beef dishes (and sending its answers to the process $cheapMeal$)
and the other computing expensive dessert dishes. As a result, the cost to compute
the joins (in rules $r_2$ and $r_3$) is significantly reduced.
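
As a rough illustration of the two processes for $dish$, the sketch below filters a made-up $dish$ relation by the two node labels described above. The facts, prices, and the `dish_process` helper are hypothetical; a real message-passing evaluator would run these as separate processes exchanging need and solution messages.

```python
# Hypothetical EDB facts; dish names and prices are invented for illustration.
dish = {("steak", 12), ("ribs", 30), ("cake", 18), ("sorbet", 6)}
beef = {"steak", "ribs"}
dessert = {"cake", "sorbet"}


def dish_process(keep):
    """One 'process' for dish: emits only the tuples satisfying its node label."""
    return {(x, z) for (x, z) in dish if keep(x, z)}


# A single process sends every dish tuple to both consumers.
single = dish_process(lambda x, z: True)

# Two processes derived from the refined labels: each consumer receives
# only the tuples that can take part in its join.
cheap_beef = dish_process(lambda x, z: z < 15 and x in beef)
expensive_dessert = dish_process(lambda x, z: z > 15 and x in dessert)

print(len(single), len(cheap_beef), len(expensive_dessert))
```

Each consumer now joins against one tuple instead of four in this toy relation, which is the kind of saving the refined labels are meant to provide.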

Deriving Optimal Search Strategies

The query-tree implicitly encodes the space of derivations that an inference engine
should search. The novelty of the query-tree is that it encodes a subset of the space
that would have been searched by ordinary backward chaining, and therefore following
the query-tree enables pruning of parts of the search space. A different approach
to speeding up inference is finding optimal strategies for searching a
given space [Smith, 1986; Greiner, 1991; Greiner, 1992].

The query-tree can be used to complement and extend these methods in two ways.
First, by delimiting the actual space that needs to be searched, some search paths can
be eliminated from consideration when looking for the optimal search strategy.
Second, the methods described by Smith and Greiner require a graph-like representation
of the possible derivations of the query. The query-tree provides such a
representation which treats recursion and interpreted literals in a principled way, unlike the
representations that are currently used. Consequently, it can be used as a basis for
extending such techniques to fully incorporate knowledge about interpreted literals.
In particular, the query-tree can be used to extend Greiner's algorithm [Greiner, 1991]
to knowledge bases with recursive rules.

The goal of Explanation Based Learning [Minton et al., 1989] is also to speed
up inferences. In EBL, new rules are added to the knowledge base that compress
sequences of inference into a single rule. The sequences are learned by examining
derivations of observed queries. The key issue in this approach is the utility of the
added rules [Minton, 1988; Etzioni and Minton, 1992]. Adding too many rules may
have the inverse effect of slowing down inference. Moreover, the learned rules may
be long and require many unification operations. Etzioni [Etzioni, 1993] has shown
that much of the speedup obtained by EBL can be obtained by merely doing static
analysis of the rules in the knowledge base. Using a tree-like representation of the
rules in the knowledge base, called the Problem Space Graph (PSG), he showed how
to glean from it new rules that were more effective than those learned by standard
EBL techniques.

The problem space graph is similar in principle to the query-tree. However, it does
not consider the semantics of interpreted literals in the rules. It also uses a very simple
termination condition in the case of recursion: a node is not expanded if it is a variable
renaming of one of its ancestors. It is therefore possible to extend Etzioni's techniques
by refining the construction of the PSG with the labeling schemes employed by the
query-tree. By attaching constraint labels to the nodes, we can discover additional
sequences of actions that are guaranteed to fail. We can also attach tag-labels to
nodes and use them to find sequences of actions that necessarily contain loops. A key
difference between the PSG and the query-tree is that the decision whether to expand
a node in the PSG depends partially on its ancestors. In contrast, the information
used in order to decide whether to expand a node in the query-tree is encoded in the
node itself. To fully integrate the query-tree and the PSG we need to find methods
to terminate the construction of the PSG based only on local criteria. It should be
noted that if the termination condition depends on the ancestors of a node, the size
of the resulting tree can be exponential in the number of rules. In contrast, the size
of the query-tree may be exponential only in the arity of the predicates.


The Query-Tree in Knowledge Acquisition

A different use of the query-tree is as a tool for knowledge acquisition and knowledge
base management. The query-tree essentially gives us a view (or picture) of the
knowledge base relative to a query. It shows us exactly which derivations can be made
and what formulas can be used in such derivations. A key problem in knowledge
acquisition is that as the knowledge base grows, it is very hard to understand the
interactions between the rules and the effects of changes. The query-tree shows us
visually the dependence between rules and formulas. If a rule is removed, it can tell
us that certain formulas have become irrelevant to the query, whereas if a rule is
added, it can show us a dependency we have not anticipated. In that way it can also
help us find erroneous rules (e.g., an oversimplified or overly specific rule).

4.4 Summary

The query-tree is a very useful tool for many purposes. In this chapter we have
explored only a few of its uses. We have shown that using the query-tree we can
obtain significant speedups of inference, sometimes a few orders of magnitude. We
have shown the query-tree to be a useful tool in deriving conclusions from external
irrelevance claims and in extending other existing query evaluation methods.

Based on the observations made in this chapter, we can address one of the
fundamental questions regarding the usage of explicit irrelevance claims. Namely, should
we enable users to give a system explicit irrelevance claims that are based on
additional knowledge that they may have, or should we require that they give the system
the knowledge about the domain that underlies these irrelevance claims and develop
methods for exploiting such knowledge to control inference? For example, instead of
telling the system that flights are irrelevant to the query Route(SF, LA, $90), tell the
system that:

• All flights cost more than $100.

• Costs of busses and flights are all positive numbers.

• The sum of two positive numbers is positive, etc.

The system could then automatically derive that flights will be irrelevant to this
query. The advantage of this approach¹¹ is that the underlying knowledge may be
used in more flexible ways (e.g., it may be used for other queries as well). If the
knowledge underlying the irrelevance claims changes, the system can automatically
derive new irrelevance claims.
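
The derivation the system would have to carry out can be sketched as follows. The sketch assumes, beyond what is stated above, that the cost of a route is the sum of the costs of its legs and that the third argument of the query is the total cost; $c(\cdot)$, $f$ and $\ell_i$ are hypothetical names for the cost function, a flight leg, and the remaining legs.

```latex
\begin{align*}
c(\mathit{route}) &= c(f) + \sum_{i} c(\ell_i),
  \qquad c(f) > 100, \quad c(\ell_i) > 0 \ \text{for every } i \\
\Rightarrow\quad c(\mathit{route}) &> 100 > 90 .
\end{align*}
```

Under these assumptions no route that contains a flight can cost \$90, so every flight fact is irrelevant to the query.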

In general, this argument has much merit. When the knowledge underlying the
irrelevance claims is available, there are clear advantages to giving a system that
knowledge. In fact, the query-tree is a method for exploiting such knowledge
effectively. However, there are several cases in which explicit irrelevance claims will be
very useful:

1. It may not be possible to provide the knowledge underlying the irrelevance
claims because of the expressive limits of the language being used. Going
beyond the expressivity of the given language may affect the performance of the
system significantly (even assuming it can support inference in more expressive
languages). For example, stating that the join of two relations is empty cannot
be done with Horn rules.

2. Providing the additional domain knowledge underlying an irrelevance claim may
require adding a level of detail that is unwanted. For example, it may require
adding new objects or predicates (in our example, the axioms of arithmetic),
that will ultimately make the representation more complex and slow down
inference.

3. The knowledge underlying the irrelevance claims may be of heuristic nature,
and the user may not know the knowledge underlying it.

4. Irrelevance claims can be based on cached inferences made from the underlying
domain knowledge. As long as their justifications are maintained, using them
at run-time will lead to significant savings. The experimental results presented
in this chapter can be viewed as a validation for this argument. The
computation done by the query-tree and the indices created are actually precomputing
irrelevance claims. Though these computations can be done at run-time, the
experiments show that it is much less beneficial to do so.

5. As we see in Chapter 7, in some contexts we are given theories in which certain
simplifying assumptions are made about the domain. In such cases, an explicit
representation of these assumptions (via irrelevance claims) is useful in deciding
when to use the given theories.

¹¹ Advocated to me by Matt Ginsberg.


4.4.1 Related Work

In addition to the work described in the previous section, there are several other
works related to the topics discussed in this chapter.

Building a query-tree and the corresponding indices for a query can be viewed as
an instance of a general framework for knowledge compilation discussed in [Selman
and Kautz, 1991]. In their framework, a new simpler knowledge base is created such
that it will yield faster answers for a large number of the queries. For example,
they show how to create Horn approximations of a theory that can be used in many
cases to answer the query. One key difference between these approaches is that our
transformed knowledge base is built with respect to a known set of queries, and for
these queries inference will be more efficient.

A related approach is that of partial evaluation and in particular, partial
evaluation of logic programs [Smith and Hickey, 1990; Lloyd and Shepherdson, 1991;
Bruynooghe et al., 1991]. Partial evaluation attempts to compile a set of rules in a
way that will be efficient for a known set of queries. The query-tree method can be
viewed as a generalization of previous methods for partial evaluation of constraint
logic programs. In particular, work in logic programming has not emphasized the
distinction between the rules in the program and the set of ground formulas, whereas
our approach argues that this distinction is necessary for our algorithms to be of
practical interest. Moreover, the query-tree is the only partial evaluation procedure
that yields the tightest constraints on the possible ground formulas that can appear
in derivations of the query.

We have shown that the query-tree can be used to derive logical conclusions of
irrelevance claims that are given to the system. A different approach, described
in [Subramanian and Genesereth, 1987; Subramanian, 1989], is to give an
axiomatization of irrelevance and use the axioms to reason about irrelevance claims. Conceivably,
the same could be done in our framework by giving axiomatizations (or partial
axiomatizations) to the various kinds of irrelevance claims in our space of definitions
(in fact, Chapter 2 presents properties of irrelevance claims that can form a basis for
such an axiomatization). However, we preferred to pursue the algorithmic approach
since it is likely to be more efficient and its results are easier to characterize.


Chapter 5 

Independence of Queries From Updates


This chapter considers the problem of detecting when a query is independent of an
update to the knowledge base. This problem is primarily important because it enables
us to save the computation needed to reevaluate a query after updates. Detecting
independence is also a key issue in developing heterogeneous and distributed knowledge
base systems [Genesereth, 1992; Litwin et al., 1990]. In such systems, updates in one
knowledge base may trigger updates in another. For example, an important
application that gives rise to such a setting is concurrent engineering [Cutkosky et al., 1993;
Levitt et al., 1991], where several agents may be working on different parts of one
design. Design decisions made by one agent may impose constraints on the possible
design decisions of another agent, and therefore must be communicated. However, in
order not to be burdened by excessive communication, only the changes that affect the
other agents must be communicated. In database systems, detecting independence is
important for several reasons. It can be used in order to maintain materialized views
effectively.¹ In transaction scheduling we can provide greater flexibility by identifying
that one transaction is independent of updates made by another. Finally, we can
use independence in query optimization by ignoring parts of the database for which
updates do not affect a specific query.

In this chapter we relate the independence problem to our framework for reasoning
about irrelevance. We show that detecting independence is equivalent to detecting
weak irrelevance. Making this connection sheds light on the independence problem,
and enables us to significantly improve previous results in this area. In general,
detecting independence is undecidable. However, viewing independence as a problem
of detecting weak irrelevance yields new sufficient conditions for independence,
obtained from algorithms that provide sufficient conditions for detecting weak
irrelevance. One such sufficient condition is strong irrelevance, which we
have examined in detail in Chapter 3.

¹ A view is a portion or an abstraction of a database. For example, a view may consist of an IDB
relation. A materialized view is one that is maintained in computed form, as opposed to computing it on
demand.

A second sufficient condition for the case of Horn rule knowledge bases is based
on the observation that detecting weak irrelevance can be couched as a problem of
detecting equivalence of datalog programs. The notion of uniform equivalence,
introduced in [Sagiv, 1988], can be used to provide a sufficient condition for equivalence of
datalog programs. In order to use uniform equivalence for detecting independence, we
extend the algorithm described in [Sagiv, 1988] to programs with interpreted literals
and stratified negation. The result provides new decidable cases for independence
and weak irrelevance and sound algorithms for the general case.

Our results significantly extend the previously known results on detecting
independence. Specifically, it is shown that the results of [Blakeley et al., 1989; Elkan, 1990]
only capture strong irrelevance in datalog knowledge bases without recursion and with
additional restrictions on the rules. Our results extend the previous ones in two ways.
We provide a strong irrelevance test for arbitrary datalog KBs (and the extensions
described in Section 3.4), and we provide independence tests based on weak irrelevance
which capture a larger class of independence than strong irrelevance.


5.1 Definitions 

In this chapter we will consider knowledge bases containing a set of datalog (cf.
[Ullman, 1989]) rules $\mathcal{P}$ and a set of ground formulas. We refer to the rules as a datalog
program. We also allow the rules to have safe stratified negation. We denote the
EDB predicates by $e_1,\ldots,e_m$ and the IDB predicates by $i_1,\ldots,i_n$. The input to a
datalog program $\mathcal{P}$ is an EDB, i.e., a set of ground formulas for the EDB predicates,
$E_1,\ldots,E_m$. We can also view $E_1,\ldots,E_m$ as relations for the EDB predicates in the
intended interpretation of the knowledge base. A bottom-up evaluation is one in
which we start with the ground EDB formulas and apply the rules to derive formulas
for the IDB predicates. We continue applying the rules until no new formulas are
generated.

We distinguish one IDB predicate as the query predicate. The output of the
program $\mathcal{P}$ for the input $E_1,\ldots,E_m$, denoted $\mathcal{P}(E_1,\ldots,E_m)$, is the set of all ground
formulas generated for the query predicate in the bottom-up evaluation. The query
predicate is usually denoted as $q$.

A datalog program $\mathcal{P}$ is said to be monotonic (anti-monotonic) in the EDB
predicates if for any input EDBs $D_1$ and $D_2$, $D_1 \supseteq D_2$ implies that $\mathcal{P}(D_1) \supseteq \mathcal{P}(D_2)$
($\mathcal{P}(D_1) \subseteq \mathcal{P}(D_2)$). Containment and equivalence of datalog programs are defined as
follows.

Definition 5.1: (Containment) A datalog program $\mathcal{P}_1$ contains a program $\mathcal{P}_2$,
written $\mathcal{P}_2 \subseteq \mathcal{P}_1$, if for all EDBs $E_1,\ldots,E_m$, the output of $\mathcal{P}_1$ contains that of $\mathcal{P}_2$,
i.e., $\mathcal{P}_2(E_1,\ldots,E_m) \subseteq \mathcal{P}_1(E_1,\ldots,E_m)$. □

Two programs $\mathcal{P}_1$ and $\mathcal{P}_2$ are equivalent, written $\mathcal{P}_1 \equiv \mathcal{P}_2$, if $\mathcal{P}_2 \subseteq \mathcal{P}_1$ and
$\mathcal{P}_1 \subseteq \mathcal{P}_2$. Containment of datalog programs is undecidable [Shmueli, 1987], even for
programs without interpreted predicates or stratified negation. However, a weaker
condition, uniform containment, was introduced and shown to be decidable in [Sagiv,
1988] for programs without interpreted predicates or stratified negation. In defining
uniform containment, we assume that the input to a program $\mathcal{P}$ consists of
relations for the EDB predicates $E_1,\ldots,E_m$ as well as initial relations for the IDB
predicates $I_1^0,\ldots,I_n^0$.² The output of program $\mathcal{P}$ for $E_1,\ldots,E_m,I_1^0,\ldots,I_n^0$, written
$\mathcal{P}(E_1,\ldots,E_m,I_1^0,\ldots,I_n^0)$, is computed as earlier by applying rules bottom-up until
no new formulas are generated. When dealing with uniform containment
(equivalence), we assume that the output is not just the relation for the query predicate
but rather relations for all the IDB predicates $I_1,\ldots,I_n$ computed for the predicates
$i_1,\ldots,i_n$, respectively. An output $I_1,\ldots,I_n$ contains another output $I'_1,\ldots,I'_n$ if
$I'_j \subseteq I_j$ for every $j$, $1 \leq j \leq n$.

Definition 5.2: (Uniform Containment) A program $\mathcal{P}_1$ uniformly contains $\mathcal{P}_2$,
written $\mathcal{P}_2 \subseteq^u \mathcal{P}_1$, if for all EDBs $E_1,\ldots,E_m$ and for all initial IDBs $I_1^0,\ldots,I_n^0$,

$$\mathcal{P}_2(E_1,\ldots,E_m,I_1^0,\ldots,I_n^0) \subseteq \mathcal{P}_1(E_1,\ldots,E_m,I_1^0,\ldots,I_n^0).$$ □

Two programs $\mathcal{P}_1$ and $\mathcal{P}_2$ are uniformly equivalent, written $\mathcal{P}_1 \equiv^u \mathcal{P}_2$, if $\mathcal{P}_2 \subseteq^u \mathcal{P}_1$
and $\mathcal{P}_1 \subseteq^u \mathcal{P}_2$. Uniform containment can also be explained in model-theoretic
terms [Sagiv, 1988]. The uniform containment $\mathcal{P}_2 \subseteq^u \mathcal{P}_1$ holds if and only if $M(\mathcal{P}_1) \subseteq
M(\mathcal{P}_2)$, where $M(\mathcal{P}_i)$ denotes the set of all models of $\mathcal{P}_i$.³ Furthermore, for programs
having only EDB predicates in bodies of rules, uniform containment is the same as
containment. Note that a non-recursive program with no negation can be transformed
into this form by unfolding of the rules. Consequently, uniform equivalence provides
a necessary condition for equivalence for non-recursive programs.
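
For plain datalog (no interpreted literals, no negation), uniform containment can be tested by "freezing" each rule of the contained program: the variables of the rule are replaced by fresh constants, the frozen body is given as input to the containing program, and the frozen head must be derived. The following Python sketch illustrates this test on the two usual formulations of transitive closure. The tuple encoding of atoms and rules, the convention that capitalized strings are variables, and all function names are assumptions of the sketch; it does not cover the extensions developed in this chapter.

```python
def is_var(t):
    """Convention for this sketch: variables are strings starting with uppercase."""
    return isinstance(t, str) and t[:1].isupper()


def match(atom, fact, subst):
    """Extend subst so that atom matches the ground fact, or return None."""
    (pred, args), (fpred, fargs) = atom, fact
    if pred != fpred or len(args) != len(fargs):
        return None
    s = dict(subst)
    for a, f in zip(args, fargs):
        if is_var(a):
            if s.setdefault(a, f) != f:
                return None
        elif a != f:
            return None
    return s


def apply_rule(rule, facts):
    """All head instances derivable by one application of a safe rule."""
    head, body = rule
    substs = [{}]
    for atom in body:
        substs = [s2 for s in substs for f in facts
                  if (s2 := match(atom, f, s)) is not None]
    return {(head[0], tuple(s.get(a, a) for a in head[1])) for s in substs}


def closure(program, facts):
    """Naive bottom-up evaluation; input may contain EDB and initial IDB facts."""
    facts = set(facts)
    while True:
        new = set().union(*(apply_rule(r, facts) for r in program)) - facts
        if not new:
            return facts
        facts |= new


def freeze(rule):
    """Replace each variable T by the fresh constant 'c_T'."""
    fz = lambda atom: (atom[0], tuple(f"c_{t}" if is_var(t) else t for t in atom[1]))
    head, body = rule
    return fz(head), {fz(b) for b in body}


def uniformly_contains(p1, p2):
    """True if p2 is uniformly contained in p1 (plain datalog only)."""
    for r in p2:
        head, body = freeze(r)
        if head not in closure(p1, body):
            return False
    return True


base = (("p", ("X", "Y")), [("e", ("X", "Y"))])
linear = [base, (("p", ("X", "Y")), [("e", ("X", "Z")), ("p", ("Z", "Y"))])]
doubly = [base, (("p", ("X", "Y")), [("p", ("X", "Z")), ("p", ("Z", "Y"))])]

print(uniformly_contains(doubly, linear), uniformly_contains(linear, doubly))
```

The two programs compute the same relation for $p$ from any EDB, yet only one direction of uniform containment holds, because the doubly recursive rule can combine initial $p$ facts that the linear program cannot.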

2 Note that in defining equivalence, the initial relations for the IDB predicates are assumed to be 
empty. 

3 Note that, in contrast, containment holds if the set of minimal fixpoint models of
$\mathcal{P}_1$ are contained in those of $\mathcal{P}_2$.
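
To make the difference between containment and uniform containment concrete, the
following Python sketch evaluates two small programs bottom-up, once from an EDB alone
and once from initial IDB relations. The two programs (a left-linear and a doubly
recursive definition of p from e), the function names, and the sample relations are
illustrative assumptions chosen for this sketch; they are not taken from [Sagiv, 1988].

```python
def left_linear(e, p0):
    """p(X,Y) <- e(X,Y);  p(X,Y) <- e(X,Z), p(Z,Y).  p0 is the initial IDB relation."""
    p = set(p0) | set(e)
    while True:
        new = {(x, y) for (x, z) in e for (z2, y) in p if z == z2} - p
        if not new:
            return p
        p |= new


def transitive(e, p0):
    """p(X,Y) <- e(X,Y);  p(X,Y) <- p(X,Z), p(Z,Y).  p0 is the initial IDB relation."""
    p = set(p0) | set(e)
    while True:
        new = {(x, y) for (x, z) in p for (z2, y) in p if z == z2} - p
        if not new:
            return p
        p |= new


edges = {(1, 2), (2, 3)}

# With empty initial IDB relations the two programs agree on this EDB.
same_on_edb = left_linear(edges, set()) == transitive(edges, set())

# With an empty EDB and a non-empty initial relation for p they differ:
# only the doubly recursive program can combine two initial p-facts.
from_idb_left = left_linear(set(), edges)
from_idb_trans = transitive(set(), edges)
```

On this input the doubly recursive program produces every tuple the left-linear one
does, plus (1, 3), which is the behaviour uniform containment is sensitive to and
plain containment is not. A single input only illustrates the definitions; it does not
establish containment in either sense.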
5.1.1 Updates and Independence

Given a datalog program $\mathcal{P}$, which we call the query program, we consider
updates to the EDB predicates of $\mathcal{P}$, denoted by $e_1, \ldots, e_m$. In an
update, we either remove or add ground formulas to the extensional database. To
simplify notation, we assume that updates are always done on the relation $E_1$ for
$e_1$. To specify the set of formulas that are updated in $E_1$, we assume we have
another datalog program, called the update program, denoted by $\mathcal{P}^u$. The
query predicate of $\mathcal{P}^u$ is $u$, and its arity is equal to that of $e_1$.
The tuples computed for $u$ will be the set of tuples updated in $E_1$.

We assume (without loss of generality) that the IDB predicates of $\mathcal{P}^u$ are
different from those of $\mathcal{P}$. The EDB predicates of $\mathcal{P}^u$, however,
could be EDB predicates of $\mathcal{P}$, as well as predicates not appearing in
$\mathcal{P}$. To distinguish the two sets of EDB predicates, we will use the phrase
"EDB predicates" to refer exclusively to the EDB predicates $e_1, \ldots, e_m$ of the
query program $\mathcal{P}$; the other extensional predicates that may appear in the
update program are referred to as base predicates, denoted by $b_1, \ldots, b_l$. Of
course, some of the EDB predicates may also appear in the update program
$\mathcal{P}^u$. We denote the output of update program $\mathcal{P}^u$ as
$\mathcal{P}^u(E_1, \ldots, E_m, B_1, \ldots, B_l)$, even if $\mathcal{P}^u$ does not
use all (or any) of the EDB predicates. Sometimes we refer to the output of
$\mathcal{P}^u$ simply by $U$.

An update is either an insertion or a deletion and it applies to the relation $E_1$
for the EDB predicate $e_1$. The tuples to be inserted into or deleted from $E_1$ are
those in the relation computed for $u$. A large class of updates consists of those not
depending on the EDB relations, as captured by the following definition:

Definition 5.3: (Oblivious Update) An update specified by an update program
$\mathcal{P}^u$ is oblivious with respect to a query program $\mathcal{P}$ if
$\mathcal{P}^u$ has only base predicates (and no EDB predicates). An update is
nonoblivious if the update program $\mathcal{P}^u$ has some EDB predicates (and
possibly some base predicates). ∎

To define independence, suppose we are given a query program $\mathcal{P}$ and an
update program $\mathcal{P}^u$. The program $\mathcal{P}$ is independent of the given
update if the update does not change the answer to the query predicate. The formal
definition is as follows.

Definition 5.4: (Independence) A program $\mathcal{P}$ is independent of an update
specified by a program $\mathcal{P}^u$ if for all EDB relations $E_1, \ldots, E_n$ and
for all base relations $B_1, \ldots, B_l$,

    $\mathcal{P}(E_1, E_2, \ldots, E_n) = \mathcal{P}(E_1', E_2, \ldots, E_n)$,

where $E_1'$ is the result of applying the update to $E_1$; that is,
$E_1' = E_1 \cup U$ if the update is an insertion, and $E_1' = E_1 - U$ if the update
is a deletion, where $U = \mathcal{P}^u(E_1, \ldots, E_n, B_1, \ldots, B_l)$. ∎
We use the following notation. $\mathit{In}^+(\mathcal{P}, \mathcal{P}^u)$ means that
program $\mathcal{P}$ is independent of the insertion specified by the update program
$\mathcal{P}^u$. Similarly, $\mathit{In}^-(\mathcal{P}, \mathcal{P}^u)$ means that
program $\mathcal{P}$ is independent of the deletion specified by the update program
$\mathcal{P}^u$.

Example 5.5: Consider the following program $\mathcal{P}_1$:

    $\mathit{inCar}(X,Y,A) \wedge \mathit{driver}(X) \wedge \mathit{inCar}(Z,Y,B) \wedge B \geq 18 \Rightarrow \mathit{canDrive}(X,Y,A)$
    $\mathit{canDrive}(X,Y,A) \wedge A \geq 18 \Rightarrow \mathit{adultDriver}(X)$

An atom $\mathit{canDrive}(X,Y,A)$ is true if person $X$ can drive car $Y$ and $A$ is
the age of $X$. According to the rule for canDrive, person $X$ can drive car $Y$ if
$X$ is a driver and there is someone of the age 18 or older in the same car. An adult
driver, as computed by the IDB predicate adultDriver, is anyone who can drive a car
and is of the age 18 or older. Let the update program $\mathcal{P}^u$ consist of the
rule:

    $\mathit{inCar}(X,Y,A) \wedge \neg \mathit{driver}(X) \wedge A < 18 \Rightarrow u_1(X,Y,A)$

and suppose that the deletion defined by $u_1$ is applied to inCar; that is,
non-drivers under the age of 18 are removed from inCar. The query predicate
adultDriver is independent of the deletion update $\mathcal{P}^u$ because the
existence of non-drivers under the age of 18 does not affect the ability to derive
that a person can drive. ∎

Several properties of independence are shown by Elkan [Elkan, 1990]. In particular,
he showed the following.

Lemma 5.6: Consider a query program $\mathcal{P}$ and an update program
$\mathcal{P}^u$. If $\mathcal{P}^u$ is monotonic in the EDB predicates, and
$\mathcal{P}$ is either monotonic or anti-monotonic in the EDB predicates, then

    $\mathit{In}^-(\mathcal{P}, \mathcal{P}^u) \Rightarrow \mathit{In}^+(\mathcal{P}, \mathcal{P}^u)$.

Similarly to the above lemma, we can also prove the following.

Lemma 5.7: Consider a query program $\mathcal{P}$ and an update program
$\mathcal{P}^u$. If $\mathcal{P}^u$ is anti-monotonic in the EDB predicates, and
$\mathcal{P}$ is either monotonic or anti-monotonic in the EDB predicates, then

    $\mathit{In}^+(\mathcal{P}, \mathcal{P}^u) \Rightarrow \mathit{In}^-(\mathcal{P}, \mathcal{P}^u)$.

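Proof: Consider an EDB $E_1, \ldots, E_n$, denoted as $\bar{E}$, and relations
$B_1, \ldots, B_l$, denoted as $\bar{B}$, for the base predicates. The tuples of the
update are given by $U = \mathcal{P}^u(\bar{E}, \bar{B})$. A deletion update
transforms the EDB $\bar{E}$ into the EDB $E_1 - U, E_2, \ldots, E_n$, denoted as
$\bar{E}^-$. We have to show the following:

    $\mathcal{P}(\bar{E}^-) = \mathcal{P}(\bar{E})$.

Consider the EDB $\bar{E}^-$ with the relations $\bar{B}$ for the base predicates. Let
$U' = \mathcal{P}^u(\bar{E}^-, \bar{B})$. Since $\mathcal{P}^u$ is anti-monotonic in
the EDB, $U \subseteq U'$.

We now apply the insertion update specified by
$U' = \mathcal{P}^u(\bar{E}^-, \bar{B})$ to $\bar{E}^-$, yielding

    $(E_1 - U) \cup U', E_2, \ldots, E_n$.

Since $\mathit{In}^+(\mathcal{P}, \mathcal{P}^u)$ is assumed, we get

    $\mathcal{P}(\bar{E}^-) = \mathcal{P}((E_1 - U) \cup U', E_2, \ldots, E_n)$.    (5.1)

Moreover, $U \subseteq U'$ implies

    $E_1 - U \subseteq E_1 \subseteq (E_1 - U) \cup U'$.    (5.2)

If $\mathcal{P}$ is monotonic in the EDB, then (5.2) implies

    $\mathcal{P}(\bar{E}^-) \subseteq \mathcal{P}(\bar{E}) \subseteq \mathcal{P}((E_1 - U) \cup U', E_2, \ldots, E_n)$,

and so, from (5.1) we get

    $\mathcal{P}(\bar{E}^-) = \mathcal{P}(\bar{E})$.

Similarly, if $\mathcal{P}$ is anti-monotonic in the EDB, then (5.2) implies

    $\mathcal{P}(\bar{E}^-) \supseteq \mathcal{P}(\bar{E}) \supseteq \mathcal{P}((E_1 - U) \cup U', E_2, \ldots, E_n)$,

and so, from (5.1) we get

    $\mathcal{P}(\bar{E}^-) = \mathcal{P}(\bar{E})$. ∎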

Note that if an update is oblivious, then it is both monotonic and anti-monotonic.
Therefore, the above two lemmas imply the following corollary.

Corollary 5.8: Consider a query program $\mathcal{P}$ and an update program
$\mathcal{P}^u$. If the update is oblivious (i.e., EDB predicates of $\mathcal{P}$ do
not appear in $\mathcal{P}^u$), and $\mathcal{P}$ is either monotonic or
anti-monotonic in the updated EDB predicates, then the following equivalence holds:

    $\mathit{In}^-(\mathcal{P}, \mathcal{P}^u) \iff \mathit{In}^+(\mathcal{P}, \mathcal{P}^u)$.

The importance of Lemma 5.7 and Corollary 5.8, as we will see in the next section,
lies in the fact that testing $\mathit{In}^+(\mathcal{P}, \mathcal{P}^u)$ is usually
easier than testing $\mathit{In}^-(\mathcal{P}, \mathcal{P}^u)$.
5.2 Irrelevance, Independence and Equivalence

In this section, we formalize the connection between three related problems:
irrelevance of formulas and independence and equivalence of datalog programs. We show
that independence of a deletion update is equivalent to a form of weak irrelevance.
We also show that the independence problem can be formulated as a problem of
equivalence of two datalog programs. We exploit this connection as follows:

• We develop algorithms that provide a sufficient condition for independence
based on strong irrelevance.

• We develop novel algorithms for detecting equivalence of datalog programs. As
a result, we get algorithms for detecting independence and weak irrelevance.

Irrelevance and Independence

As stated, independence of a deletion update is equivalent to weak irrelevance. Given
an update program $\mathcal{P}^u$ and a database $D$ (consisting of the EDB relations
and the base relations), we denote by $E_1^u(D)$ all the formulas of the form
$e_1(a_1, \ldots, a_n)$, where $(a_1, \ldots, a_n) \in \mathcal{P}^u(D)$.


Lemma 5.9: Let $\mathcal{P}$ be a datalog program with query predicate $q$ and
$\mathcal{P}^u$ be an update program, where both $\mathcal{P}$ and $\mathcal{P}^u$
have no negation. The independence $\mathit{In}^-(\mathcal{P}, \mathcal{P}^u)$ holds
if and only if for any database $D$,
$WI(E_1^u(D), q, \mathcal{P} \cup D, DI_1, \mathcal{D}_q)$ holds.

Proof: Assume that $\mathit{In}^-(\mathcal{P}, \mathcal{P}^u)$ holds, and let
$D = E_1, \ldots, E_n, B_1, \ldots, B_m$ be an arbitrary database and let
$q(\bar{a}) \in \mathcal{P}(D)$. To show weak irrelevance, we must show that
$q(\bar{a})$ has a derivation that does not use formulas in $E_1^u$. However, since
$\mathit{In}^-(\mathcal{P}, \mathcal{P}^u)$ holds,

    $\mathcal{P}(E_1, \ldots, E_n) = \mathcal{P}(E_1 - U, E_2, \ldots, E_n)$,

where $U = \mathcal{P}^u(D)$. This means that $q(\bar{a})$ has a derivation $d$ from
$E_1 - U, E_2, \ldots, E_n$, and $d$ does not contain formulas in $E_1^u$.

Conversely, suppose that $WI(E_1^u(D), q, \mathcal{P} \cup D, DI_1, \mathcal{D}_q)$
holds for any database $D$. Clearly,

    $\mathcal{P}(E_1, \ldots, E_n) \supseteq \mathcal{P}(E_1 - U, E_2, \ldots, E_n)$.

To show independence, we need to show that

    $\mathcal{P}(E_1, \ldots, E_n) \subseteq \mathcal{P}(E_1 - U, E_2, \ldots, E_n)$.

Let $q(\bar{a}) \in \mathcal{P}(E_1, \ldots, E_n)$. Since $E_1^u$ is weakly irrelevant
to $q$, $q(\bar{a})$ will have a derivation $d$ that does not contain formulas in
$E_1^u$. Therefore, $d$ will also be a derivation of $q(\bar{a})$ from
$E_1 - U, E_2, \ldots, E_n$, and so
$q(\bar{a}) \in \mathcal{P}(E_1 - U, E_2, \ldots, E_n)$ and the independence holds. ∎

Elkan [Elkan, 1990] shows that detecting independence in general is undecidable.
This also follows from Lemma 5.9, since weak irrelevance is undecidable in general.
However, viewing independence as weak irrelevance provides insight into the problem
of detecting independence. For example, the following corollary will yield an
algorithm for detecting independence.

Corollary 5.10: If for any database $D$,
$SI(E_1^u(D), q, \mathcal{P} \cup D, DI_1, \mathcal{D}_q)$ holds, then
$\mathit{In}^-(\mathcal{P}, \mathcal{P}^u)$.

The corollary follows from Lemma 5.9 and Lemma 2.7. To detect strong irrelevance
of $E_1^u$, we create a new datalog program that explicitly contains a predicate
representing the relation $E_1^u$. Specifically, given the rules $\mathcal{P}$ and
$\mathcal{P}^u$, we create a new program $\mathcal{P}_1$ as follows:

1. $\mathcal{P}_1$ includes the rules of $\mathcal{P}$ and $\mathcal{P}^u$.

2. $\mathcal{P}_1$ includes the rule

    $e_1(\bar{X}) \wedge u(\bar{X}) \Rightarrow e_1^u(\bar{X})$

that defines the relation $E_1^u$.

3. $\mathcal{P}_1$ includes rules that enable using formulas of $e_1^u$ whenever the
corresponding formulas of $e_1$ would be used. Specifically, let $e_1(\bar{Y})$ be
some occurrence of $e_1$ in a rule $r \in \mathcal{P}$. The program $\mathcal{P}_1$
includes the rule $r'$ created by replacing the literal $e_1(\bar{Y})$ in the
antecedent of $r$ by $e_1^u(\bar{Y})$.
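
The three steps above can be sketched in Python as follows. The rule encoding (a rule
is a pair of a head atom and a list of body atoms, and an atom is a predicate name
with a tuple of arguments), the function name build_p1, and the suffix _u used to name
the new predicate are assumptions made for illustration only. The sample rules form a
small path program with a single update rule whose interpreted literal is encoded as
an ordinary atom.

```python
def build_p1(query_rules, update_rules, e1, u, arity):
    """Return the rules of P_1 built from P (query_rules) and P^u (update_rules)."""
    e1_u = e1 + "_u"  # hypothetical name for the new predicate e_1^u
    xs = tuple(f"X{i}" for i in range(1, arity + 1))

    # Step 1: all rules of P and P^u.
    p1 = list(query_rules) + list(update_rules)

    # Step 2: e1(X) and u(X) together define e1_u(X).
    p1.append(((e1_u, xs), [(e1, xs), (u, xs)]))

    # Step 3: one extra rule per occurrence of e1 in a body of P,
    # with that occurrence replaced by e1_u.
    for head, body in query_rules:
        for i, (pred, args) in enumerate(body):
            if pred == e1:
                new_body = list(body)
                new_body[i] = (e1_u, args)
                p1.append((head, new_body))
    return p1


query_rules = [
    (("path", ("X", "Y")), [("link", ("X", "Y"))]),
    (("path", ("X", "Y")), [("link", ("X", "Z")), ("path", ("Z", "Y"))]),
    (("link", ("X", "Y")), [("step", ("X", "Y"))]),
]
update_rules = [
    (("u", ("X", "Y")), [("step", ("X", "Y")), ("<", ("X", "90"))]),
]
p1 = build_p1(query_rules, update_rules, e1="step", u="u", arity=2)
```

The result keeps the original rules untouched and only adds rules, so any derivation
from the original program remains a derivation from the new one.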

The following lemma assures that detecting strong irrelevance of formulas of $e_1^u$
in $\mathcal{P}_1$ will entail independence of $\mathcal{P}^u$:

Lemma 5.11: Let $D$ be a database and $e_1(\bar{a})$ be a formula such that
$u(\bar{a}) \in \mathcal{P}^u(D)$. Then $e_1(\bar{a})$ is part of a derivation of the
query from $\mathcal{P} \cup D$ if and only if $e_1^u(\bar{a})$ is part of a
derivation of the query from $\mathcal{P}_1 \cup D$.

Proof: Let $d$ be a derivation from $\mathcal{P} \cup D$ that contains
$e_1(\bar{a})$; i.e., $d$ uses an instance of a rule $r$ of the form

    $e_1(\bar{a}) \wedge s_1(\bar{a}_1) \wedge \ldots \wedge s_l(\bar{a}_l) \Rightarrow p(\bar{b})$.

We can assume that the leftmost literal in the antecedent is $e_1(\bar{a})$. Recall
that $\bar{a} \in E_1^u$, and therefore, there is a derivation $d_u$ of $u(\bar{a})$
from $\mathcal{P}_1 \cup D$ (using only the rules coming from $\mathcal{P}^u$). The
program $\mathcal{P}_1$ contains a rule $r'$ in which the leftmost literal in the
antecedent of $r$, $e_1(\bar{X})$, is replaced by $e_1^u(\bar{X})$. To create a
derivation of the query $d'$ that uses $e_1^u(\bar{a})$, we replace $r$ by $r'$. The
goal-node $e_1^u(\bar{a})$ is unified with the rule defining $e_1^u$, resulting in the
instantiated rule

    $e_1(\bar{a}) \wedge u(\bar{a}) \Rightarrow e_1^u(\bar{a})$.

The atom $e_1(\bar{a})$ is obviously satisfied (because it was satisfied in $d$). To
complete $d'$, we make the atom $u(\bar{a})$ the head of the derivation $d_u$.

For the other direction, let $d$ be a derivation from $\mathcal{P}_1 \cup D$ that
contains $e_1^u(\bar{a})$. We simply reverse the transformation we performed above and
get a derivation $d'$ of the query from $\mathcal{P} \cup D$. ∎

Consequently, if there is no node of $e_1^u$ in the query-tree of $\mathcal{P}_1$,
then $E_1^u(D)$ is strongly irrelevant to the query, and therefore, by Corollary 5.10,
$\mathit{In}^-(\mathcal{P}, \mathcal{P}^u)$ holds.

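Example 5.12: Consider again our goodPath example given by the rules:

    $r_1$: $\mathit{badPoint}(X) \wedge \mathit{path}(X,Y) \wedge \mathit{goodPoint}(Y) \Rightarrow \mathit{goodPath}(X,Y)$
    $r_2$: $\mathit{link}(X,Y) \Rightarrow \mathit{path}(X,Y)$
    $r_3$: $\mathit{link}(X,Z) \wedge \mathit{path}(Z,Y) \Rightarrow \mathit{path}(X,Y)$
    $r_4$: $\mathit{step}(X,Y) \Rightarrow \mathit{link}(X,Y)$

and the additional constraints:

    $\mathit{badPoint}(X) \Rightarrow 100 < X < 200$

    $\mathit{step}(X,Y) \Rightarrow X < Y$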

    $\mathit{goodPoint}(X) \Rightarrow 150 < X < 1?0$.

Suppose we want to remove the formulas of $\mathit{step}(X,Y)$ for which $X < 90$. We
would add the following rules to the program:

    $r_5$: $\mathit{step}(X,Y) \wedge (X < 90) \Rightarrow \mathit{lowStep}(X,Y)$
    $r_6$: $\mathit{lowStep}(X,Y) \Rightarrow \mathit{link}(X,Y)$.

The query-tree built for goodPath will be identical to the one shown in Figure 4.1 and
will not contain a node of the predicate lowStep. Therefore, goodPath is independent
of this update. ∎

Independence and Equivalence 

The independence problem can also be formulated as a problem of detecting equivalence
of datalog programs. To show that, we construct a new program that computes
the new value of the query predicate $q$ (after the update) from the old value of the
EDB (before the update). One program, $\mathcal{P}^+$, is constructed for the case of
insertion, and another program, $\mathcal{P}^-$, is constructed for the case of
deletion. We then pose the independence problem as the equivalence of the original
program and $\mathcal{P}^+$ (or $\mathcal{P}^-$). Each of $\mathcal{P}^+$ and
$\mathcal{P}^-$ consists of three parts:

• The rules of $\mathcal{P}$, after all occurrences of the predicate name $e_1$ have
been replaced by a new predicate name $s$.

• The rules of the update program $\mathcal{P}^u$.

• Rules for the new predicate s. 

$\mathcal{P}^+$ and $\mathcal{P}^-$ differ only in the third part. In the case of
insertion, the predicate $s$ in $\mathcal{P}^+$ is intended to represent the relation
$E_1$ after the update, and therefore the rules for $s$ are:

    $e_1(X_1, \ldots, X_k) \Rightarrow s(X_1, \ldots, X_k)$

    $u(X_1, \ldots, X_k) \Rightarrow s(X_1, \ldots, X_k)$.

In the case of deletion, the predicate $s$ in $\mathcal{P}^-$ is intended to represent
the deletion update to $E_1$, and the rule for defining it is

    $e_1(X_1, \ldots, X_k) \wedge \neg u(X_1, \ldots, X_k) \Rightarrow s(X_1, \ldots, X_k)$.

Note that since $\mathcal{P}$ and $\mathcal{P}^u$ do not share IDB predicates, the
negation in $\mathcal{P}^-$ will be stratified. The following propositions are
immediate corollaries of the definition of independence.

Proposition 5.13: $\mathit{In}^+(\mathcal{P}, \mathcal{P}^u) \iff \mathcal{P} \equiv \mathcal{P}^+$.

Proposition 5.14: $\mathit{In}^-(\mathcal{P}, \mathcal{P}^u) \iff \mathcal{P} \equiv \mathcal{P}^-$.

Proof: Both propositions follow from the observation that the relation computed for
$s$ is the updated relation for $e_1$. Therefore, since $e_1$ is replaced by $s$ in
the rules of the program, the new program will compute the relation for $q$ after the
update. Clearly, the independence holds if and only if the new program is equivalent
to the original program. ∎

Returning to Example 5.5, both $\mathcal{P}^+$ and $\mathcal{P}^-$ will have the
following rules:

    $s(X,Y,A) \wedge \mathit{driver}(X) \wedge s(Z,Y,B) \wedge B \geq 18 \Rightarrow \mathit{canDrive}(X,Y,A)$
    $\mathit{canDrive}(X,Y,A) \wedge A \geq 18 \Rightarrow \mathit{adultDriver}(X)$
    $\mathit{inCar}(X,Y,A) \wedge \neg \mathit{driver}(X) \wedge A < 18 \Rightarrow u_1(X,Y,A)$.

The program $\mathcal{P}^+$ will contain the rules:

    $\mathit{inCar}(X,Y,A) \Rightarrow s(X,Y,A)$

    $u_1(X,Y,A) \Rightarrow s(X,Y,A)$

and the program $\mathcal{P}^-$ will contain the rule:

    $\mathit{inCar}(X,Y,A) \wedge \neg u_1(X,Y,A) \Rightarrow s(X,Y,A)$.

Both $\mathcal{P}^+$ and $\mathcal{P}^-$ are equivalent to the original program, and
therefore, the query $\mathit{adultDriver}(X)$ is independent of the insertion and
deletion updates of $\mathcal{P}^u$.

5.2.1 Understanding Previous Work 

Relating the independence problem to irrelevance enables us to understand better
previous work on independence by [Blakeley et al., 1989] and [Elkan, 1990]. Both of
them considered restricted languages in which weak irrelevance is the same as strong
irrelevance. The result of Blakeley et al. [Blakeley et al., 1989] applies just to
conjunctive queries (cf. [Ullman, 1989]), i.e., knowledge bases in which the
antecedents of every rule are EDB predicates. Furthermore, the rules are restricted
such that every predicate can only appear once in the antecedent. Elkan generalizes
the result by Blakeley et al. to deal with interpreted constraints and only requires
that the query be conjunctive in the updated predicate, as defined below. In the
definition, $\mathit{Def}(q)$ denotes all the predicates that can appear in a
derivation of $q$:

Definition 5.15: A query $q$ is conjunctive in the updated predicate if it is defined
by a single rule of the form:

    $e_1(\bar{Y}) \wedge s_1(\bar{X}_1) \wedge \ldots \wedge s_n(\bar{X}_n) \Rightarrow q(\bar{X})$,

where $e_1$ has a single appearance in the rule, and $e_1 \notin \mathit{Def}(s_i)$
for $1 \leq i \leq n$. ∎

Under this restriction, weak irrelevance is equivalent to strong irrelevance:

Observation 5.16: Let the query $q$ be conjunctive in the predicate $e_1$; then for
any formula $e_1(\bar{b})$,
$WI(e_1(\bar{b}), q, \Sigma_{\mathcal{P}}, DI_1, \mathcal{D}_q) \iff SI(e_1(\bar{b}), q, \Sigma_{\mathcal{P}}, DI_1, \mathcal{D}_q)$.

Proof: We only need to show that
$WI(e_1(\bar{b}), q, \Sigma_{\mathcal{P}}, DI_1, \mathcal{D}_q) \Rightarrow SI(e_1(\bar{b}), q, \Sigma_{\mathcal{P}}, DI_1, \mathcal{D}_q)$.
Assume by contradiction that
$WI(e_1(\bar{b}), q, \Sigma_{\mathcal{P}}, DI_1, \mathcal{D}_q)$ holds, and suppose
$D$ is a database in which $e_1(\bar{b})$ is used in a derivation $d$ of
$q(\bar{a})$. Consider the database $D'$ which is identical to $D$ except that the
relation $E_1$ includes only the tuple $\bar{b}$. The derivation $d$ is still a valid
derivation of $q(\bar{a})$ from $D'$, since $e_1$ is only used once in $d$. However,
there is no derivation of $q(\bar{a})$ from $D'$ that does not use $e_1(\bar{b})$,
because some formula of $e_1$ must be used in the derivation of $q(\bar{a})$.
Consequently, $WI(e_1(\bar{b}), q, \Sigma_{\mathcal{P}}, DI_1, \mathcal{D}_q)$ cannot
hold. ∎
Using the query-tree to detect independence is an immediate generalization of
Elkan's algorithm. It provides a strong irrelevance test for arbitrary datalog programs,
thereby removing the restrictions that the program cannot have recursion and that
the query must be conjunctive in the updated predicate.

In the next section, we describe algorithms for deciding equivalence of datalog 
programs and therefore for detecting weak irrelevance. The results provide new de- 
cidable cases for independence and provide sufficient conditions for independence in 
the general case. 

To illustrate the added power of our algorithm, consider the query
$\mathit{adultDriver}(X)$ when the update program is:

    $\mathit{inCar}(X,Y,A) \wedge \neg \mathit{driver}(X) \Rightarrow u_2(X,Y,A)$.

Suppose we apply the deletion defined by $u_2$ to the relation inCar, i.e., we remove
any person who is not a driver. The query adultDriver is independent of this deletion
update (because $X$ and $Z$ can be bound to the same constant in the rule defining
canDrive). However, there are derivations of $\mathit{adultDriver}(X)$ that use adult
non-drivers, and therefore the formulas computed by $u_2$ are not strongly irrelevant
to the query adultDriver. Consequently, Elkan's algorithm will not detect the
independence in this example.
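
The claim can be checked mechanically on small inputs. The Python sketch below
enumerates every database over a tiny, arbitrarily chosen domain (two people, one car,
two ages), applies the deletion defined by $u_2$, and compares the answers before and
after. The domain, the names, and the set-based evaluation of the two rules are
assumptions of this sketch; an exhaustive check over one small domain illustrates the
claim but does not prove independence in general.

```python
from itertools import chain, combinations, product


def can_drive(in_car, drivers):
    """canDrive(X,Y,A): X is a driver in car Y, which also holds someone aged 18 or more."""
    return {(x, y, a) for (x, y, a) in in_car
            if x in drivers and any(y2 == y and b >= 18 for (_, y2, b) in in_car)}


def adult_drivers(in_car, drivers):
    """adultDriver(X): X can drive some car and is aged 18 or more."""
    return {x for (x, _, a) in can_drive(in_car, drivers) if a >= 18}


def subsets(items):
    items = list(items)
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))


people = ["ann", "bob"]
candidate_tuples = list(product(people, ["car"], [17, 30]))

adult_unchanged = True      # is adultDriver the same after every deletion?
can_drive_changed = False   # does canDrive differ for at least one database?
for in_car in map(set, subsets(candidate_tuples)):
    for drivers in map(set, subsets(people)):
        u2 = {t for t in in_car if t[0] not in drivers}   # tuples selected by u_2
        remaining = in_car - u2
        if adult_drivers(in_car, drivers) != adult_drivers(remaining, drivers):
            adult_unchanged = False
        if can_drive(in_car, drivers) != can_drive(remaining, drivers):
            can_drive_changed = True
```

On this domain the answer to adultDriver never changes, while the intermediate
relation canDrive does change for some databases (for instance, a 17-year-old driver
sharing the car with a 30-year-old non-driver), which is why the deleted tuples take
part in derivations even though the query is independent of the deletion.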

5.3 Testing Equivalence of Datalog Programs 

In the remainder of this chapter we consider the problem of testing equivalence of
datalog programs. As stated earlier, solutions to this problem directly impact the
independence problem. Shmueli [Shmueli, 1987] showed that detecting equivalence of
two datalog programs is in general undecidable even if the programs do not contain
interpreted predicates or negation. Sagiv [Sagiv, 1988] introduced a weaker condition,
uniform equivalence, and showed that it is decidable for datalog programs without
interpreted predicates or negation. Recall that the reduction of the problem of
independence to equivalence involved testing equivalence of programs with stratified
negation. Therefore, in order to use uniform equivalence for detecting independence,
we extend the algorithms described in [Sagiv, 1988] to handle both interpreted
predicates (which are defined in Section 2.-1) and stratified negation. Specifically,
we show the following:

Theorem 5.17: Testing whether a datalog program $\mathcal{P}_1$ is uniformly
equivalent to a datalog program $\mathcal{P}_2$ is decidable even if $\mathcal{P}_1$
and $\mathcal{P}_2$ include interpreted predicates and stratified negation.

As a consequence of this theorem, we get the following results for testing
independence:

Corollary 5.18: Independence is decidable in the following cases:

1. $\mathit{In}^+(\mathcal{P}, \mathcal{P}^u)$ ($\mathit{In}^-(\mathcal{P}, \mathcal{P}^u)$)
is decidable if both $\mathcal{P}^+$ ($\mathcal{P}^-$) and $\mathcal{P}$ have only
interpreted and EDB predicates (that may appear positively or negatively) in bodies
of rules.$^{4}$

2. Both $\mathit{In}^+(\mathcal{P}, \mathcal{P}^u)$ and
$\mathit{In}^-(\mathcal{P}, \mathcal{P}^u)$ are decidable if $\mathcal{P}$ is
non-recursive, and $\mathcal{P}^u$ has only rules of the form

    $e_1(X_1, \ldots, X_k) \wedge c \Rightarrow u(X_1, \ldots, X_k)$,

where $c$ is a conjunction of interpreted literals such that $\neg c$ is expressible
in the constraint language.

Proof: The first half of the corollary follows from the observation that for these
classes of programs, uniform equivalence is also a necessary condition for equivalence.
The second half holds because in both $\mathcal{P}^+$ and $\mathcal{P}^-$ the rules
defining $s$ will not contain negation and therefore both $\mathcal{P}^+$ and
$\mathcal{P}^-$ can be rewritten equivalently to satisfy the conditions of the first
half of the corollary. ∎

Corollary 5.19:

1. $\mathit{In}^+(\mathcal{P}, \mathcal{P}^u)$ is decidable if both $\mathcal{P}$ and
$\mathcal{P}^u$ are non-recursive and only EDB predicates appear negated.

2. If, in addition, the update is oblivious, then
$\mathit{In}^-(\mathcal{P}, \mathcal{P}^u)$ is decidable.

Proof: The first half follows from the observation that under the conditions stated,
both $\mathcal{P}$ and $\mathcal{P}^+$ can be rewritten to satisfy the condition of
Corollary 5.18. The second half follows from the first and from Corollary 5.8. ∎

5.3.1 Uniform Equivalence with Interpreted Predicates 

The algorithm for detecting uniform containment (and equivalence) for datalog
programs without interpreted predicates is based on the model theoretic
characterization of the notion, shown in [Sagiv, 1988], which also holds for programs
with interpreted predicates. Specifically, the uniform containment
$\mathcal{P}_2 \subseteq^u \mathcal{P}_1$ holds if and only if
$M(\mathcal{P}_1) \subseteq M(\mathcal{P}_2)$, where $M(\mathcal{P}_i)$ denotes the set
of all models of $\mathcal{P}_i$.$^{5}$ We note that
$M(\mathcal{P}_1) \subseteq M(\mathcal{P}_2)$ holds if and only if
$M(\mathcal{P}_1) \subseteq M(r)$ for every rule $r \in \mathcal{P}_2$, since a
database $D$ is a model of $\mathcal{P}_2$ if and only if it is a model of every rule
$r \in \mathcal{P}_2$. Therefore, we can decide whether
$M(\mathcal{P}_1) \subseteq M(\mathcal{P}_2)$ by checking whether
$M(\mathcal{P}_1) \subseteq M(r)$ for every $r \in \mathcal{P}_2$.

4 We prefer to describe this case in terms of $\mathcal{P}$ and $\mathcal{P}^+$,
rather than $\mathcal{P}$ and $\mathcal{P}^u$, since it is clearer.

5 Note that when we consider the case of interpreted predicates, the models of
$\mathcal{P}_i$ must map the interpreted predicates to their natural interpretation.

Based on this observation, when the programs have no interpreted predicates, the
following algorithm (from [Sagiv, 1988]) will decide whether a given rule $r$ is
uniformly contained in a program $\mathcal{P}$. Given a rule $r$ of the form

    $q_1 \wedge \ldots \wedge q_n \Rightarrow p$,

we use a substitution $\theta$ that maps every variable in the antecedent of $r$ to a
distinct symbol that does not appear in $\mathcal{P}$ or $r$. We then apply the
program $\mathcal{P}$ to the atoms $q_1\theta, \ldots, q_n\theta$. Sagiv shows that
the program $\mathcal{P}$ generates $p\theta$ from $q_1\theta, \ldots, q_n\theta$ if
and only if $M(\mathcal{P}) \subseteq M(r)$.

Example 5.20: Suppose we are trying to determine whether the rule

    $r_1$: $e(X,Z) \wedge p(Z,Y) \Rightarrow p(X,Y)$

is contained in the program $\mathcal{P}_1$:

    $p(X,Z) \wedge p(Z,Y) \Rightarrow p(X,Y)$
    $e(X,Y) \Rightarrow p(X,Y)$.

We apply $\mathcal{P}_1$ to the ground atoms $e(x_0, z_0)$ and $p(z_0, y_0)$. Since we
are able to derive $p(x_0, y_0)$, the rule $r_1$ is contained in $\mathcal{P}_1$. ∎
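
For programs without interpreted predicates, the test just described can be sketched
in a few lines of Python. The encoding is an assumption of this sketch: an atom is a
predicate name with a tuple of arguments, variables are strings starting with an
upper-case letter, and the substitution $\theta$ freezes a variable such as X into the
symbol x0, which is assumed not to occur in the program or the rule. The evaluator is
a naive bottom-up fixpoint, written for clarity rather than efficiency.

```python
def is_var(term):
    return term[:1].isupper()


def match(atom, fact, subst):
    """Extend subst so that atom becomes fact; return None if that is impossible."""
    (pred, args), (fpred, fargs) = atom, fact
    if pred != fpred or len(args) != len(fargs):
        return None
    subst = dict(subst)
    for a, f in zip(args, fargs):
        if is_var(a):
            if subst.setdefault(a, f) != f:
                return None
        elif a != f:
            return None
    return subst


def evaluate(rules, facts):
    """Apply the rules bottom-up until no new facts are generated."""
    facts = set(facts)
    while True:
        new = set()
        for (hpred, hargs), body in rules:
            substs = [{}]
            for atom in body:
                substs = [s2 for s in substs for f in facts
                          if (s2 := match(atom, f, s)) is not None]
            for s in substs:
                new.add((hpred, tuple(s.get(a, a) for a in hargs)))
        if new <= facts:
            return facts
        facts |= new


def rule_contained(rule, program):
    """Freeze the body of rule, run program on it, and look for the frozen head."""
    head, body = rule

    def freeze(atom):
        return (atom[0], tuple(a.lower() + "0" if is_var(a) else a for a in atom[1]))

    return freeze(head) in evaluate(program, {freeze(a) for a in body})


left_rule = (("p", ("X", "Y")), [("e", ("X", "Z")), ("p", ("Z", "Y"))])
base_rule = (("p", ("X", "Y")), [("e", ("X", "Y"))])
trans_rule = (("p", ("X", "Y")), [("p", ("X", "Z")), ("p", ("Z", "Y"))])

left_in_transitive = rule_contained(left_rule, [trans_rule, base_rule])
transitive_in_left = rule_contained(trans_rule, [left_rule, base_rule])
```

The first call mirrors the check of Example 5.20. The second call runs the same test
in the opposite direction, where the frozen body contains no e-facts and nothing new
can be derived.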

However, there is a problem in applying this algorithm to programs with interpreted
predicates. First, the constants used in the input to $\mathcal{P}$, i.e., those that
appear in $q_1\theta, \ldots, q_n\theta$, are arbitrary, and therefore, interpreted
predicates are not defined on them. Consequently, the interpreted literals in the
rules (that may involve $<$, $\leq$, etc.) cannot be evaluated. Moreover, some of the
derivations of $p\theta$ by $\mathcal{P}$ depend on the symbols satisfying the
interpreted constraints, and so these cannot be discarded.

We address this problem by associating a constraint with every fact involved in
the evaluation of $\mathcal{P}$. The constraints for a given fact $f$ represent the
conditions on the constants in $q_1\theta, \ldots, q_n\theta$ under which $f$ is
derivable. We manipulate these constraints as we evaluate $\mathcal{P}$. Formally, let
$r$ be the rule:

    $q_1 \wedge \ldots \wedge q_n \wedge c_r \Rightarrow p$.    (5.3)

We denote the set of variables in $r$ by $\bar{Y}$. The subgoal $c_r$ is the
conjunction of the literals of interpreted predicates in $r$. We assume that all
literals in $r$ have distinct variables in every argument position. Note that this
requirement can always be fulfilled by introducing additional literals using the $=$
predicate. As in the original algorithm, we define a mapping $\theta$ that maps each
variable in $r$ to a distinct symbol not appearing in $\mathcal{P}$ or $r$. Instead of
evaluating $\mathcal{P}$ with the ground atoms $q_1\theta, \ldots, q_n\theta$, we
evaluate $\mathcal{P}$ with facts that are pairs of the form $(q, c)$, where $q$ is a
ground atom and $c$ is a constraint on the symbols in $\bar{Y}\theta$. The input to
$\mathcal{P}$ will be the pairs $(q_i\theta, c_r\theta)$, for $i = 1, 2, \ldots, n$.

An application of a rule

    $g_1 \wedge \ldots \wedge g_l \wedge c \Rightarrow h$

proceeds as follows. Let $(a_1, c^1), \ldots, (a_l, c^l)$ be pairs generated
previously, such that there is a substitution $\tau$ for which $g_i\tau = a_i$
($1 \leq i \leq l$). Let $c_h$ be the conjunction
$c^1 \wedge \ldots \wedge c^l \wedge c\tau$. If $c_h$ is satisfiable, we derive the
pair $(h\tau, c_h)$. In words, the constraint of the new pair generated is the
conjunction of the constraints on the pairs used in the derivation and the
constraints of the rule that was applied in that derivation. We apply the rules of
$\mathcal{P}$ until no new pairs are generated. Note that there are only a finite
number of possible constraints for the generated pairs and, therefore, the bottom-up
evaluation must terminate.

Finally, let $(p\theta, c_1), \ldots, (p\theta, c_m)$ be all the pairs generated for
the atom $p\theta$ in the evaluation of $\mathcal{P}$; recall that $p$ is the head of
Rule (5.3) and $\theta$ is the substitution used to convert the variables of that rule
to new symbols. As we will prove, the containment $M(\mathcal{P}) \subseteq M(r)$
holds if and only if $c_r \models c_1 \vee \ldots \vee c_m$, where $c_r$ is the
conjunction of interpreted predicates from the antecedent of Rule (5.3).

Example 5.21: Let $\mathcal{P}_1$ be the program:

    $r_1$: $e(X,Z) \wedge p(Z,Y) \Rightarrow p(X,Y)$
    $r_2$: $e(X,Y) \Rightarrow q(X,Y)$.

Let $\mathcal{P}_2$ be the program:

    $s_1$: $p(X,Z) \wedge p(Z,Y) \Rightarrow p(X,Y)$
    $s_2$: $e(X,Y) \wedge (X \leq Y) \Rightarrow p(X,Y)$
    $s_3$: $e(X,Y) \wedge (Y < X) \Rightarrow q(X,Y)$
    $s_4$: $p(X,Y) \Rightarrow q(X,Y)$.

For a variable $X$ of a rule $r$, we denote the constant $X\theta$ by $x_0$. True
denotes the constraint satisfied by all tuples. To check the uniform containment of
$r_1$ in $\mathcal{P}_2$, the input to $\mathcal{P}_2$ would be
$(e(x_0, z_0), \mathit{True})$ and $(p(z_0, y_0), \mathit{True})$. Rule $s_2$ will
derive $(p(x_0, z_0), x_0 \leq z_0)$ and rule $s_1$ will then derive
$(p(x_0, y_0), x_0 \leq z_0)$. Since $p(x_0, y_0)$ was only generated under the
constraint $x_0 \leq z_0$, the rule $r_1$ is not uniformly contained in
$\mathcal{P}_2$.

To check the uniform containment of rule $r_2$ in $\mathcal{P}_2$, we begin with
$(e(x_0, y_0), \mathit{True})$. Rule $s_3$ will then derive
$(q(x_0, y_0), y_0 < x_0)$. Rule $s_2$ will derive $(p(x_0, y_0), x_0 \leq y_0)$, and
rule $s_4$ will use that to derive $(q(x_0, y_0), x_0 \leq y_0)$. Since $q(x_0, y_0)$
was derived for both possible orderings of $x_0$ and $y_0$, rule $r_2$ is uniformly
contained in $\mathcal{P}_2$. However, since $r_1$ is not uniformly contained in
$\mathcal{P}_2$, the program $\mathcal{P}_1$ is not uniformly contained in
$\mathcal{P}_2$. ∎
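
The outcome of such a check can be reproduced with a short Python sketch. Note that
the sketch swaps in a different, simpler technique than the pair-based evaluation
described above: instead of propagating symbolic constraints, it enumerates every
weak ordering (ordering with ties) of the rule's variables, instantiates the variables
by their ranks, evaluates the program on the resulting ground atoms, and reports the
orderings that satisfy the rule's own constraint but do not yield the head. The set of
orderings in which the head is derived plays the role of the disjunction of the
derived constraints. The sketch assumes that all arguments are variables and that the
only interpreted predicates are the comparisons listed in OPS; the rule encoding and
all names are hypothetical.

```python
from itertools import product

OPS = {"<": lambda a, b: a < b, "<=": lambda a, b: a <= b, "=": lambda a, b: a == b}


def weak_orderings(n):
    """Rank vectors for every way of ordering n values when ties are allowed."""
    return [r for r in product(range(n), repeat=n)
            if set(r) == set(range(len(set(r))))]


def match(atom, fact, subst):
    """Extend subst so that atom (all arguments variables) becomes fact, else None."""
    (pred, args), (fpred, vals) = atom, fact
    if pred != fpred or len(args) != len(vals):
        return None
    subst = dict(subst)
    for a, v in zip(args, vals):
        if subst.setdefault(a, v) != v:
            return None
    return subst


def derive(program, facts):
    """Bottom-up evaluation; interpreted literals are checked on the bound values."""
    facts = set(facts)
    while True:
        new = set()
        for (hpred, hargs), body, cons in program:
            substs = [{}]
            for atom in body:
                substs = [s2 for s in substs for f in facts
                          if (s2 := match(atom, f, s)) is not None]
            for s in substs:
                if all(OPS[op](s[x], s[y]) for op, x, y in cons):
                    new.add((hpred, tuple(s[a] for a in hargs)))
        if new <= facts:
            return facts
        facts |= new


def failing_orderings(rule, program):
    """Orderings that satisfy the rule's constraint but do not produce its head."""
    head, body, cons = rule
    variables = sorted({v for _, args in [head] + body for v in args})
    bad = []
    for ranks in weak_orderings(len(variables)):
        tau = dict(zip(variables, ranks))
        if not all(OPS[op](tau[x], tau[y]) for op, x, y in cons):
            continue  # this ordering violates c_r
        ground = [(pred, tuple(tau[v] for v in args)) for pred, args in [head] + body]
        if ground[0] not in derive(program, set(ground[1:])):
            bad.append(tau)
    return bad


P2 = [
    (("p", ("X", "Y")), [("p", ("X", "Z")), ("p", ("Z", "Y"))], []),
    (("p", ("X", "Y")), [("e", ("X", "Y"))], [("<=", "X", "Y")]),
    (("q", ("X", "Y")), [("e", ("X", "Y"))], [("<", "Y", "X")]),
    (("q", ("X", "Y")), [("p", ("X", "Y"))], []),
]
r1 = (("p", ("X", "Y")), [("e", ("X", "Z")), ("p", ("Z", "Y"))], [])
r2 = (("q", ("X", "Y")), [("e", ("X", "Y"))], [])

bad_r1 = failing_orderings(r1, P2)
bad_r2 = failing_orderings(r2, P2)
```

An empty result means that the rule is uniformly contained in the program. For the
second rule no ordering fails, while for the first rule the failing orderings are
those that place X strictly above Z, matching the constraint found in Example 5.21.
Enumerating orderings is only practical for rules with a handful of variables.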

To prove the correctness of the algorithm, the following lemma relates derivations 
of pairs to those of ground atoms. 

Lemma 5.22: Let $d$ be a derivation of the pair $(p\theta, c)$ from $\mathcal{P}$ and
the database containing the pairs
$(q_1\theta, c_r\theta), \ldots, (q_n\theta, c_r\theta)$, and let $\tau$ be a
substitution that maps each variable of $r$ to a constant such that the constraint $c$
is satisfied by $\bar{Y}\tau$. Let $d'$ be the derivation in which every node
$(n\theta, c_0\theta)$ in $d$ is replaced by the ground formula $n\tau$. The
derivation $d'$ is a valid derivation of $p\tau$ from $\mathcal{P}$ and
$q_1\tau, \ldots, q_n\tau$.

Proof: To prove the lemma, we need to show that in every rule application in $d'$, the
constants that are involved satisfy the interpreted constraints. The proof is based on
the following observation. The constraint $c_0$ in a node $(n\theta, c_0\theta)$ in
$d$ is stronger than the constraints of its subgoals and stronger than the
conjunction of interpreted literals in the rule used to derive
$(n\theta, c_0\theta)$. This follows from the way we evaluated $\mathcal{P}$ with
pairs, where the constraint of the head pair was the conjunction of the constraints
of the rule being applied and the subgoals used. Therefore, since $\bar{Y}\tau$
satisfies the constraint in the root of $d$, it satisfies the constraints of all the
nodes in $d$. ∎

Based on this lemma, the correctness of the algorithm is established by the fol- 
lowing theorem. 

Theorem 5.23: $M(\mathcal{P}) \subseteq M(r) \iff c_r \models c_1 \vee \ldots \vee c_m$.

Proof: For the first direction, assume $c_r \models c_1 \vee \ldots \vee c_m$. We need
to show that $M(\mathcal{P}) \subseteq M(r)$. To show that, it is enough to show the
following. If $\tau$ is a substitution that maps each variable in $\bar{Y}$ to a
constant such that

1. $\bar{Y}\tau$ satisfies $c_r$, and

2. $q_1\tau, \ldots, q_n\tau \in M$, where $M \in M(\mathcal{P})$,

then $p\tau \in M$. We show that by showing that if $\mathcal{P}$ is applied to the
inputs $q_1\tau, \ldots, q_n\tau$, then $p\tau$ will be derived.

Since $\bar{Y}\tau$ satisfies $c_r$, there exists at least one $i$,
$1 \leq i \leq m$, such that $\bar{Y}\tau$ satisfies $c_i$. Consider the derivation of
$(p\theta, c_i)$ from $\mathcal{P}$. Lemma 5.22 guarantees that $p\tau$ has a
derivation from $q_1\tau, \ldots, q_n\tau$.

For the second direction, suppose $c_r \not\models c_1 \vee \ldots \vee c_m$. We need
to show that there exists a model of $\mathcal{P}$ that is not a model of $r$. By the
assumption, there must be some instantiation of $\bar{Y}$, $\bar{Y}\tau$, such that
$\bar{Y}\tau$ satisfies $c_r\tau$ but does not satisfy any of the $c_i\tau$'s.$^{6}$
Let $\mathbf{M}$ be the set of interpretations (for the predicates in $r$ and
$\mathcal{P}$) which include $q_1\tau, \ldots, q_n\tau$ and do not include $p\tau$.
Clearly, none of the interpretations in $\mathbf{M}$ are models of $r$. Therefore, all
we need to show is that there exists an interpretation $M_0 \in \mathbf{M}$ such that
$M_0$ is a model of $\mathcal{P}$, and consequently
$M(\mathcal{P}) \not\subseteq M(r)$.

It is enough to show that $p\tau$ is not generated by $\mathcal{P}$ and
$q_1\tau, \ldots, q_n\tau$. $M_0$ will then be the least model of $\mathcal{P}$ that
consists of $q_1\tau, \ldots, q_n\tau$. To derive a contradiction, suppose that
$p\tau \in \mathcal{P}(q_1\tau, \ldots, q_n\tau)$, and let $d$ be a derivation of
$p\tau$. We can assume that $d$ is minimal, i.e., it does not contain two identical
nodes $n_1$ and $n_2$ such that $n_1$ is an ancestor of $n_2$. We create a derivation
$d'$ of pairs corresponding to $d$ by supplementing each goal-node of $d$ with a
constraint. The constraint attached to each leaf in $d$ is $c_r$, and the constraint
attached to each non-leaf node is the conjunction of the constraints of its subgoals
and the interpreted literals of the rule applied. Denote the resulting derivation by
$d'$. Clearly, all the constraints in pairs of $d'$ are satisfiable, because
$\bar{Y}\tau$ satisfies $c_r$ and $d$ is a valid derivation of $p\tau$. Furthermore,
we show by bottom-up induction on $d'$ that the nodes derived in $d'$ would be derived
by our algorithm. Specifically, we show that if $(q\tau, c\tau)$ is a node in $d'$,
then the pair $(q\theta, c\theta)$ would have been derived by our algorithm, where
$\theta$ is the mapping we used for the variables of $r$. The claim holds for the
leaves of $d'$, since they all have atoms from $q_1\tau, \ldots, q_n\tau$, and our
algorithm began with the pairs
$(q_1\theta, c_r\theta), \ldots, (q_n\theta, c_r\theta)$. The inductive case follows
from the observation that since all argument positions have distinct variables in
rules of $\mathcal{P}$, any rule application that was done in $d$ would have been done
by our algorithm (because the unifications did not rely on additional equalities
between constants that may have existed in $\bar{Y}\tau$ and not in
$\bar{Y}\theta$; all equalities were made explicit as separate subgoals in the rules).
However, the fact that the root of $d'$ was derived leads to a contradiction. Since
$\bar{Y}\tau$ satisfies the constraint in the root of $d'$, there would be an $i$ such
that $\bar{Y}\tau$ satisfies $c_i$. ∎

Our bottom-up evaluation of a program with a database containing facts that are
pairs of an atom and a constraint is reminiscent of the procedure used by Kanellakis
et al. [Kanellakis et al., 1990]. In their procedure, an EDB fact may be a generalized
tuple specified in the form of a constraint on the arguments of its predicate. However,
there is a key difference between the two methods. In [Kanellakis et al., 1990], the
constraint specifying a tuple considers only the arguments of the predicate involved.
In our procedure, the constraint appearing in a pair is a constraint on all the
constants that appear in the initial database, i.e., all the constants in
$\bar{Y}\theta$, where $\bar{Y}$ is the set of variables of rule $r$. Thus, the
constraint of a pair may have constants that do not appear in the atom of that pair.
The following example illustrates why their method cannot be applied for detecting
uniform containment.


6 Note that we are assuming here that the subgoals are rectified, i.e., all equality constraints on
the variables in $r$ appear in $c_r$.

Example 5.24: Consider rules $r$ and $s$:

$r$ : $q_1(X, Y) \wedge q_2(U, V) \Rightarrow p(X, Y)$
$s$ : $q_1(X, Y) \wedge q_2(U, V) \wedge (U < V) \Rightarrow p(X, Y)$

To check whether $M(s) \subseteq M(r)$, we evaluate $s$ with the pairs $(q_1(x_0, y_0), True)$ and
$(q_2(u_0, v_0), True)$. If we use the procedure of [Kanellakis et al., 1990], the result is the
pair $(p(x_0, y_0), True)$, which has no record of the fact that its derivation required
that $u_0 < v_0$. Consequently, we will conclude erroneously that $M(s) \subseteq M(r)$ holds.
In contrast, when our procedure applies rule $s$ to the pairs $(q_1(x_0, y_0), True)$ and
$(q_2(u_0, v_0), True)$, the result is the pair $(p(x_0, y_0), u_0 < v_0)$, making it clear that $s$
does not contain $r$, because $True \not\models u_0 < v_0$. □
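
The contrast in Example 5.24 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a constraint is modelled as a frozenset of atomic order constraints (the empty set standing for True), and the names `apply_s`, `entailed_by_true` and the flag `keep_constraint` are hypothetical, not part of either procedure. Setting `keep_constraint=False` merely mimics the loss of information described above; it is not an implementation of the method of [Kanellakis et al., 1990].

```python
TRUE = frozenset()  # the empty conjunction stands for the constraint True


def apply_s(q1_pair, q2_pair, keep_constraint=True):
    """Apply rule s to one pair for q1 and one pair for q2.

    Each pair is (argument tuple, constraint). The derived pair carries the
    conjunction of the input constraints and, optionally, the literal U < V.
    """
    (x, y), c1 = q1_pair
    (u, v), c2 = q2_pair
    c = c1 | c2
    if keep_constraint:
        c = c | {(u, "<", v)}
    return ("p", (x, y)), c


def entailed_by_true(constraint):
    """True entails a conjunction of non-trivial atomic constraints only if it is empty."""
    return len(constraint) == 0


q1 = (("x0", "y0"), TRUE)
q2 = (("u0", "v0"), TRUE)
atom, c = apply_s(q1, q2)
atom_lossy, c_lossy = apply_s(q1, q2, keep_constraint=False)
```

With the constraint retained, the derived pair records `u0 < v0`, so the entailment check fails as it should; with the constraint dropped, the check succeeds erroneously.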

Complexity 

The complexity of the uniform containment algorithm is in the worst case exponential
in the arity of the predicates in the programs. It depends on two factors:

1. The number of pairs generated during the evaluation of $\mathcal{P}$.

2. The complexity of checking whether $c_r \models c_1 \vee \ldots \vee c_l$ holds.

The number of pairs generated during the evaluation of $\mathcal{P}$ may, in the worst case,
be exponential in the number of variables of $r$. This is because the number of
non-equivalent constraints on $n$ constants is exponential in $n$. The complexity of the
second part is also at most exponential in the sum of the number of variables in $r$
and the number of constants appearing in $\mathcal{P}$.
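
To get a feel for this growth, the following sketch counts the complete orderings of $n$ constants when ties are allowed. Each such ordering is one satisfiable order constraint that is not equivalent to any other, so the count is a lower bound on the number of non-equivalent constraints. The function name `weak_orderings` is ours, and the recurrence is the standard one for these counts, not something taken from the algorithm itself.

```python
from math import comb


def weak_orderings(n):
    """Number of ways to order n constants when ties (equalities) are allowed."""
    counts = [1]  # one (empty) ordering of zero constants
    for k in range(1, n + 1):
        # Choose the j constants tied for the smallest position, then order the rest.
        counts.append(sum(comb(k, j) * counts[k - j] for j in range(1, k + 1)))
    return counts[n]


growth = [weak_orderings(n) for n in range(1, 6)]
```

For one to five constants the counts are 1, 3, 13, 75 and 541, which already grows faster than $2^n$.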

5.3.2 Uniform Equivalence with Stratified Negation 

In this section, we describe how to test uniform equivalence of datalog programs with
safe, stratified negation. We begin with the case of stratified programs with neither
constants nor interpreted predicates. By definition, two programs $P_1$ and $P_2$ are
uniformly equivalent, denoted $P_1 \equiv_u P_2$, if for every database $D$ (that may have both
EDB and IDB facts), $P_1(D) = P_2(D)$. Note that applying a stratified program to a
database that may also have IDB facts is done stratum by stratum, as in the usual
case; in other words, $P(D)$ is the perfect model of the program $P$ and the database
$D$ (cf. [Ullman, 1989]).

Suppose that $P_1$ and $P_2$ are not uniformly equivalent. Hence, there is a database
$D_0$ such that $P_1(D_0) \neq P_2(D_0)$; $D_0$ is called a counterexample. We may assume that
$P_1(D_0) \not\subseteq P_2(D_0)$ (the case $P_2(D_0) \not\subseteq P_1(D_0)$ is handled similarly).

We assume that both $P_1$ and $P_2$ have the same set of EDB predicates and the same
set of IDB predicates, and moreover, there is a partition of the predicates into strata
that is a stratification for both $P_1$ and $P_2$. In particular, we assume that the lowest
stratum consists of just the EDB predicates, and we refer to it as the zeroth stratum.
We denote by $P_1^i$ the program consisting of those rules of $P_1$ with head predicates
that belong to the first $i$ strata; similarly for $P_2^i$. Note that $P_1^0$ is an empty program
(i.e., it has no rules). By definition, $P_1^0(D) = D$ for every database $D$; similarly for
$P_2^0$.

We now assume that for some given $i$, $P_1^i \equiv_u P_2^i$, and we will show how to test
whether $P_1^{i+1} \equiv_u P_2^{i+1}$. The algorithm is based on the following two lemmas.

Lemma 5.25: Suppose that there is an $i$ such that $P_1^i \equiv_u P_2^i$. If there is a
counterexample database $D_0$ such that $P_1^{i+1}(D_0) \not\subseteq P_2^{i+1}(D_0)$, then there is some rule $r$ of
$P_1^{i+1}$ with a head predicate from stratum $i+1$ and a database $D$, such that

1. $D$ is a model of $P_2^{i+1}$ but not a model of $r$; and

2. The number of distinct constants in $D$ is no more than the number of distinct
variables in $r$.

Proof: Let $D' = P_2^i(D_0)$; note that since $D'$ is a perfect model, $D' = P_2^i(D')$. By the
assumption in the lemma, $P_1^i(D_0) = P_2^i(D_0)$ and, hence, $D'$ is also a counterexample,
i.e., $P_1^{i+1}(D') \not\subseteq P_2^{i+1}(D')$. Now let $\hat{D} = P_2^{i+1}(D')$. Observe that $\hat{D}$ and $D'$ have the
same set of facts for predicates of the first $i$ strata, since $D' = P_2^i(D')$. In addition,
observe that $D' \subseteq \hat{D}$. These observations imply that $P_1^{i+1}(D') \subseteq P_1^{i+1}(\hat{D})$. Thus,
$P_1^{i+1}(\hat{D}) \not\subseteq P_2^{i+1}(\hat{D})$, because $P_1^{i+1}(D') \not\subseteq P_2^{i+1}(D')$ and $P_2^{i+1}(D') = P_2^{i+1}(\hat{D})$.

So, we have shown that $P_1^{i+1}(\hat{D}) \not\subseteq P_2^{i+1}(\hat{D})$, and $\hat{D}$ is a model of $P_2^{i+1}$. Therefore,
there is a rule $r$ in $P_1^{i+1}$ of the form

$q_1 \wedge \ldots \wedge q_m \wedge \neg s_1 \wedge \ldots \wedge \neg s_l \Rightarrow h$

and a substitution $\theta$ such that

• the predicate of $h$ is from stratum $i+1$,

• $\theta$ is a mapping from the variables of $r$ to constants,

• $q_k\theta \in \hat{D}$ $(1 \leq k \leq m)$,

• $s_j\theta \notin \hat{D}$ $(1 \leq j \leq l)$, and

• $h\theta \notin \hat{D}$.

The above and the fact $\hat{D} = P_2^{i+1}(\hat{D})$ imply that the database $\hat{D}$ is a model of
$P_2^{i+1}$ but not of $r$.

Let $\bar{D}$ be the database consisting of facts from $\hat{D}$ that have only constants from
$r\theta$. Database $\bar{D}$ is also a model of $P_2^{i+1}$. In proof, suppose that $\bar{D}$ is not a model of
$P_2^{i+1}$. Thus, there is a rule $\bar{r}$ of $P_2^{i+1}$ and a substitution $\tau$, such that

1. the head $h$ of $\bar{r}$ satisfies $h\tau \notin \bar{D}$,

2. every positive subgoal $q$ of $\bar{r}$ satisfies $q\tau \in \bar{D}$, and

3. every negative subgoal $s$ of $\bar{r}$ satisfies $s\tau \notin \bar{D}$.

By the definition of $\bar{D}$, if $g$ is a ground fact having only constants from $\bar{D}$, then $g \in \bar{D}$
if and only if $g \in \hat{D}$; moreover, for every negative subgoal $s$, the constants appearing
in $s\tau$ are all from $\bar{D}$, since rules are safe (cf. [Ullman, 1989]). Therefore, items (1)-(3)
hold even if we replace $\bar{D}$ with $\hat{D}$, and so it follows that $\hat{D}$ is not a model of $\bar{r}$. This is a
contradiction, since $\hat{D}$ is a model of $P_2^{i+1}$, and $\bar{r}$ is a rule of $P_2^{i+1}$. Thus, we have
shown that $\bar{D}$ is a model of $P_2^{i+1}$. Furthermore, the conditions on $r$ and $\theta$ listed above imply that $\bar{D}$ is
not a model of $r$. So, the lemma is proved. □

Lemma 5.26: Suppose that $P_1^i \equiv_u P_2^i$. Moreover, suppose that there is a database
$D$ that is a model of $P_2^{i+1}$ and is not a model of some rule $r$ of $P_1^{i+1}$ having a head
predicate from stratum $i+1$. Then $P_1(D) \not\subseteq P_2(D)$, and, hence, $P_1 \not\equiv_u P_2$.

Proof: From the assumptions in the lemma, it follows that rule $r$ can be applied to
$D$ to generate a new fact $g$ that is not already in $D$. Note that $g \notin P_2(D)$, since
$P_2^{i+1}(D) = D$ and strata higher than $i+1$ cannot derive new facts with the same
predicate as that of $g$. If we show that rule $r$ can still generate $g$ even when $P_1$ is
applied to $D$, it would follow that $g \in P_1(D)$, and hence, $P_1(D) \not\subseteq P_2(D)$. To show
that, recall that $P_1^i \equiv_u P_2^i$ and $D$ is a model of $P_2^{i+1}$; therefore, $D$ is also a model of
$P_1^i$. Thus, rule $r$ can still generate $g$ during the application of $P_1$ to $D$, since nothing
is generated by rules of lower strata. □

The algorithm of Figure 5.1 tests whether $P_1 \equiv_u P_2$; its correctness follows from
the above two lemmas and the following proposition.

Proposition 5.27: $P_1(D) \neq P_2(D)$ if and only if there is some $i$ and a database $D$
such that either $P_1^i(D) \not\subseteq P_2^i(D)$ or $P_2^i(D) \not\subseteq P_1^i(D)$.

Proof: Clearly, if $P_1(D) \neq P_2(D)$ then there exists some stratum $i$ such that either
$P_1^i(D) \not\subseteq P_2^i(D)$ or $P_2^i(D) \not\subseteq P_1^i(D)$. Conversely, if $P_1^i(D) \not\subseteq P_2^i(D)$, then $P_1^i$ and $P_2^i$
differ in some of the facts they generate from $D$ for the first $i$ strata. Therefore, by
Lemma 5.26, $P_1(D) \neq P_2(D)$. □

Note that in the algorithm, it does not matter what the constants in $S$ are as long
as their number is equal to the number of distinct variables in the given rule $r$. Also,
if two databases over constants from $S$ are isomorphic, it is sufficient to consider just
one of them. In the algorithm we need to check whether a database $D$ is a model of
$P_2$ and not of $r$. This is done by verifying that $P_2(D) = D$ and $r(D) \neq D$.

procedure check($P_1$, $P_2$):
begin
    for every rule $r$ of $P_1$ do
    begin
        Let $S$ be a set of $v$ distinct constants, where $v$ is the number of variables in $r$;
        for every database $D$ that includes only constants from $S$ do
            if $D$ is a model of $P_2$ but not of $r$ then return false;
    end;
    return true;
end;

begin /* main procedure */
    for i := 1 to max-stratum do
        if not check($P_1^i$, $P_2^i$) or not check($P_2^i$, $P_1^i$) then return $P_1 \not\equiv_u P_2$;
    return $P_1 \equiv_u P_2$;
end

Figure 5.1: An algorithm for testing $P_1 \equiv_u P_2$.
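
The procedure check of Figure 5.1 can be sketched in Python as follows. This is a minimal brute-force illustration, not an efficient implementation: it enumerates every database over the constants in $S$ without pruning isomorphic ones, it omits the loop over strata, and it assumes rules without constants or interpreted predicates. The encoding of a rule as a triple (head, positive body, negative body) and all function names are our own choices.

```python
from itertools import combinations, product


def variables(rule):
    """All variables of a rule; every argument is assumed to be a variable."""
    head, pos, neg = rule
    return sorted({a for _, args in [head, *pos, *neg] for a in args})


def ground(atom, sub):
    pred, args = atom
    return (pred, tuple(sub[a] for a in args))


def is_model_of_rule(db, rule, consts):
    """db satisfies the rule under every substitution into consts."""
    head, pos, neg = rule
    vs = variables(rule)
    for vals in product(consts, repeat=len(vs)):
        sub = dict(zip(vs, vals))
        body_holds = (all(ground(a, sub) in db for a in pos)
                      and all(ground(a, sub) not in db for a in neg))
        if body_holds and ground(head, sub) not in db:
            return False
    return True


def all_databases(arities, consts):
    """Every set of ground facts over the given predicates and constants."""
    facts = [(p, t) for p, n in sorted(arities.items())
             for t in product(consts, repeat=n)]
    for k in range(len(facts) + 1):
        for subset in combinations(facts, k):
            yield frozenset(subset)


def check(p1, p2, arities):
    """False iff some small database is a model of p2 but not of a rule of p1."""
    for r in p1:
        consts = list(range(len(variables(r))))  # the set S
        for db in all_databases(arities, consts):
            if (all(is_model_of_rule(db, r2, consts) for r2 in p2)
                    and not is_model_of_rule(db, r, consts)):
                return False
    return True


# Two hypothetical one-rule programs: p(X,Y) :- e(X,Y)  versus  p(Y,X) :- e(X,Y).
E = ("e", ("X", "Y"))
prog_a = [(("p", ("X", "Y")), [E], [])]
prog_b = [(("p", ("Y", "X")), [E], [])]
ARITIES = {"e": 2, "p": 2}
```

Here `check(prog_a, prog_b, ARITIES)` fails because the database consisting of e(0,1) and p(1,0) is a model of the second program but not of the rule of the first.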

Example 5.28: Let $P_1$ consist of the rules:

$r_1$ : $own(X, Y) \Rightarrow lown(X, Y)$

$r_2$ : $lives(X, Z) \wedge inHouse(Z, Y) \Rightarrow lown(X, Y)$

$r_3$ : $own(X, Z) \wedge lives(Y, Z) \wedge lown(Y, U) \Rightarrow lown(X, U)$

$r_4$ : $likes(X, Y) \wedge \neg lown(X, Y) \Rightarrow buys(X, Y)$

Let $P_2$ consist of the rules $r_1$, $r_4$ and the rule:

$r_5$ : $own(X, Z) \wedge inHouse(Z, Y) \Rightarrow lown(X, Y)$

The EDB relation $own$ describes an ownership relationship between persons and
objects. The IDB relation $lown$ represents a landlord's perspective of the ownership
relation. The programs $P_1$ and $P_2$ are not uniformly equivalent. Specifically, consider
the database $D_0$:

$\{likes(a, o), lives(b, h), own(b, o), own(a, h)\}$

The programs $P_1$ and $P_2$ differ already in the first stratum (in which the relation
$lown$ is computed), since $lown(a, o) \notin P_2(D_0)$ whereas $lown(a, o) \in P_1(D_0)$. In the
second stratum, when we compute the relation $buys$, we get $buys(a, o) \in P_2(D_0)$ and
$buys(a, o) \notin P_1(D_0)$. □
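
The claims of Example 5.28 can be checked mechanically with a small stratum-by-stratum evaluator. The sketch below is a naive bottom-up evaluation written for this illustration only; the rule encoding (head, positive body, negative body) and the function names are assumptions of ours, and variables are simply the argument names appearing in the rules.

```python
def match(atom, fact, sub):
    """Extend substitution sub so that atom matches fact, or return None."""
    pred, args = atom
    if pred != fact[0] or len(args) != len(fact[1]):
        return None
    sub = dict(sub)
    for var, const in zip(args, fact[1]):
        if sub.setdefault(var, const) != const:
            return None
    return sub


def apply_rule(rule, db):
    """All head facts derivable from db by one application of the rule."""
    head, pos, neg = rule
    subs = [{}]
    for atom in pos:
        subs = [s2 for s in subs for f in db
                if (s2 := match(atom, f, s)) is not None]
    derived = set()
    for s in subs:
        # Safety guarantees that negated atoms are ground at this point.
        if all((p, tuple(s[a] for a in args)) not in db for p, args in neg):
            derived.add((head[0], tuple(s[a] for a in head[1])))
    return derived


def evaluate(strata, db):
    """Run each stratum to a fixpoint, lowest stratum first."""
    db = set(db)
    for rules in strata:
        while True:
            new = set().union(*(apply_rule(r, db) for r in rules)) - db
            if not new:
                break
            db |= new
    return db


r1 = (("lown", ("X", "Y")), [("own", ("X", "Y"))], [])
r2 = (("lown", ("X", "Y")), [("lives", ("X", "Z")), ("inHouse", ("Z", "Y"))], [])
r3 = (("lown", ("X", "U")),
      [("own", ("X", "Z")), ("lives", ("Y", "Z")), ("lown", ("Y", "U"))], [])
r4 = (("buys", ("X", "Y")), [("likes", ("X", "Y"))], [("lown", ("X", "Y"))])
r5 = (("lown", ("X", "Y")), [("own", ("X", "Z")), ("inHouse", ("Z", "Y"))], [])

P1 = [[r1, r2, r3], [r4]]
P2 = [[r1, r5], [r4]]
D0 = {("likes", ("a", "o")), ("lives", ("b", "h")),
      ("own", ("b", "o")), ("own", ("a", "h"))}

out1, out2 = evaluate(P1, D0), evaluate(P2, D0)
```

The two results disagree on lown(a, o) in the first stratum and, as a consequence, on buys(a, o) in the second.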
To extend the algorithm to programs with interpreted predicates (and constants
in the rules), we need to be careful about checking whether a database $D$ is a model
of $P_2$ and not of $r$. In order to check this we need to specify the interpretations of
the interpreted predicates on the constants in $S$. For example, suppose $r$ is the rule

$e(X, Y) \wedge (X < Y) \Rightarrow p(X, Y)$

and $P_2$ consists of the single rule

$e(X, Y) \wedge (X > Y) \Rightarrow p(X, Y)$.

Consider the database $D$ consisting of the fact $e(x_0, y_0)$. If $x_0 < y_0$ holds, then
$D$ is a model of $P_2$ and is not a model of $r$. However, if $x_0 > y_0$, then $D$ is not a
counterexample.

One conceptually simple (albeit not the most efficient) way to address this subtlety
is to try all possible interpretations to find one in which the database is a
counterexample. The interpretations can be viewed as supplements to the given database. In
the case of dense-order constraints, we would do the following. Let $C$ be the set of
constants appearing in either $P_1$ or $P_2$. Instead of considering every database over
constants from $S$, we should consider every database over constants from $S \cup C$.
Moreover, for each database, we should consider every total order on the constants
of the database such that the order is consistent with any order that may implicitly
be defined on $C$ (e.g., if $C$ is a set of integers, then presumably the usual order on
integers should apply to $C$). For each such pair consisting of a database and a total
order defined on its constants, we check whether the pair is a model of $P_2$ and not of
$r$. Consequently, we get the following theorem:

Theorem 5.29: Uniform equivalence for datalog programs with safe, stratified
negation and interpreted predicates is decidable.

Proof: First, it should be noted that Lemmas 5.25 and 5.26 and Proposition 5.27
hold also when the rules have interpreted literals. The only difference is that the
number of objects in the counterexample may be the size of $S \cup C$. All we need to show
is that trying all the consistent possible interpretations of the interpreted predicates
for the constants in $S \cup C$ will suffice to find a counterexample if there is one. In
proof, suppose we found an interpretation $I$ such that $D \cup I$ is a counterexample. In
that case, simply replace the constants in $S$ with constants from the domain of the
interpreted predicates such that the constants satisfy the same interpreted relations
as the objects in $S$ and are consistent with $I$. The result will be a counterexample
database. Conversely, suppose there is some counterexample database $D$. Simply map
the constants in $S$ to the corresponding constants in $D$, and interpret the interpreted
predicates on $S$ in the same way that the corresponding constants in $D$ are interpreted.
Clearly, our algorithm would have tried that interpretation for the constants in $S$,
and would have found the counterexample. □
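
The enumeration of total orders on $S \cup C$ that respect the usual order on $C$ can be sketched as follows. This is a simplified illustration: it generates strict total orders only, so it does not cover interpretations in which a constant of $S$ coincides with another constant, and the function name `consistent_orders` and the sample constants are hypothetical.

```python
from itertools import permutations


def consistent_orders(symbolic, numeric):
    """Yield strict total orders (smallest first) on symbolic and numeric constants
    in which the numeric constants appear in their usual order."""
    items = list(symbolic) + list(numeric)
    expected = sorted(numeric)
    for perm in permutations(items):
        if [c for c in perm if c in numeric] == expected:
            yield perm


# S = {s0, s1} and C = {0, 5}: half of the 24 permutations keep 0 before 5.
orders = list(consistent_orders(["s0", "s1"], [0, 5]))
```

Each order in `orders` would then be paired with each candidate database and tested as described above.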

A more efficient method for checking whether a database is a counterexample is
to use an algorithm similar to the one used for uniform containment with interpreted
predicates. We evaluate $r$ and $P_2$ with pairs consisting of an atom and a constraint.
The initial pairs are $(g, True)$, where $g \in D$. Let $(g_1, c_1), \ldots, (g_l, c_l)$ be the pairs
derived by $r$ and for which $g_i \notin D$. Similarly, let $(h_1, d_1), \ldots, (h_k, d_k)$ be the pairs
derived by $P_2$ and for which $h_i \notin D$. Clearly, $D$ is a model of $P_2$ if and only if the
interpretations of the interpreted predicates satisfy $\neg(d_1 \vee \ldots \vee d_k)$, since under these
constraints, $P_2$ does not derive any new facts when applied to $D$. Similarly, the
database $D$ is not a model of $r$ if the interpretations satisfy $(c_1 \vee \ldots \vee c_l)$. Therefore,
$D$ is a counterexample database for the containment of $r$ in $P_2$ if and only if the
constraint

$\neg(d_1 \vee \ldots \vee d_k) \wedge (c_1 \vee \ldots \vee c_l)$

is satisfiable.
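
For order constraints over the constants of $D$ alone, the satisfiability of this constraint can be tested by brute force. The sketch below assumes that every $c_i$ and $d_i$ is given as a Python predicate over an assignment of ranks to constants; trying all rank assignments in $0, \ldots, n-1$ covers every ordering of $n$ constants, with or without ties. The function name and the encoding are ours, and constants appearing in the rules themselves are not handled.

```python
from itertools import product


def counterexample_possible(constants, ds, cs):
    """Is not(d_1 or ... or d_k) and (c_1 or ... or c_l) satisfiable?"""
    n = len(constants)
    for ranks in product(range(n), repeat=n):
        env = dict(zip(constants, ranks))
        if not any(d(env) for d in ds) and any(c(env) for c in cs):
            return True
    return False


# D = {e(x0, y0)}; r requires x0 < y0 and the rule of P_2 requires x0 > y0.
consts = ["x0", "y0"]
c_r = [lambda env: env["x0"] < env["y0"]]
d_p2 = [lambda env: env["x0"] > env["y0"]]
```

With these inputs the constraint is satisfiable (take $x_0 < y_0$), so $D$ is a counterexample under some interpretation; if $P_2$ used the same comparison as $r$, it would not be.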

5.3.3 Beyond Uniform Containment 

For testing uniform containment of $\mathcal{P}_1$ in $\mathcal{P}_2$, it is enough to check the containment
separately for each rule of $\mathcal{P}_1$. Consequently, uniform containment completely ignores
possible interactions between the rules which may imply containment of $\mathcal{P}_1$ in $\mathcal{P}_2$.
Consider the following example.

Example 5.30: Consider the following programs whose query predicate is $p$. Let $\mathcal{P}_1$
be:

$r_1$ : $q(X) \wedge (X < 5) \Rightarrow p(X)$
$r_2$ : $e(X) \wedge (X > 0) \Rightarrow q(X)$

And let $\mathcal{P}_2$ be the program:

$r_3$ : $q(X) \wedge (X < 6) \wedge (X > 0) \Rightarrow p(X)$
$r_4$ : $e(X) \wedge (X > 0) \Rightarrow q(X)$

The program $\mathcal{P}_1$ is contained in $\mathcal{P}_2$, because whenever $0 < X < 5$, the atom $p(X)$
will be derived from $\mathcal{P}_2$ if $e(X)$ is in the database. However, $r_1$ is not uniformly
contained in $\mathcal{P}_2$ (and, therefore, $\mathcal{P}_1 \not\subseteq_u \mathcal{P}_2$). For example, the model consisting of
$\{q(-1), e(-1), \neg p(-1)\}$ is a model of $\mathcal{P}_2$ but not a model of $\mathcal{P}_1$. □
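
Both observations of Example 5.30 can be checked on concrete data. The sketch below hard-codes the two programs as Python functions over sets of integers; it tests ordinary containment only on one sample EDB, so it illustrates the claim rather than proving it. All function names are ours.

```python
def answers_p1(e):
    """Facts for p derived by P_1 (rules r1, r2) from EDB relation e."""
    q = {x for x in e if x > 0}
    return {x for x in q if x < 5}


def answers_p2(e):
    """Facts for p derived by P_2 (rules r3, r4) from EDB relation e."""
    q = {x for x in e if x > 0}
    return {x for x in q if x < 6 and x > 0}


def is_model_p1(e, q, p):
    """Do the relations e, q, p satisfy both rules of P_1?"""
    return all(x in q for x in e if x > 0) and all(x in p for x in q if x < 5)


def is_model_p2(e, q, p):
    """Do the relations e, q, p satisfy both rules of P_2?"""
    return all(x in q for x in e if x > 0) and all(x in p for x in q if 0 < x < 6)


sample_edb = set(range(-3, 9))
```

On the sample EDB every answer of the first program is also an answer of the second, while the database with e(-1) and q(-1) and no p facts is a model of the second program only.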

The weakness of the uniform containment test stems from the fact that it considers
containment of the set of all models, while in order to prove (ordinary) containment,
it is sufficient to consider containment of only the minimal models.$^7$ Therefore, there
may be cases in which containment of minimal models holds, but containment of
all models does not. To get a stronger test, we may try to transform $\mathcal{P}_1$ into an
equivalent program $\mathcal{P}'$ with a larger set of models (but, of course, the same set of
minimal models, since equivalence must be preserved). We may then be able to show
that $M(\mathcal{P}_2) \subseteq M(\mathcal{P}')$ holds, where $M(\mathcal{P}_2) \subseteq M(\mathcal{P}_1)$ failed. One way of doing this
transformation is by propagating constraints from one rule to another by using the
rules given by the query-tree. In our example, the result of constraint propagation is
the following program $\mathcal{P}'$:

$r_1'$ : $q(X) \wedge (X < 5) \wedge (X > 0) \Rightarrow p(X)$
$r_2'$ : $e(X) \wedge (X > 0) \Rightarrow q(X)$

Now we can show that $\mathcal{P}' \subseteq_u \mathcal{P}_2$, and since $\mathcal{P}_1 \equiv \mathcal{P}'$, it follows that $\mathcal{P}_1 \subseteq \mathcal{P}_2$.


5.4 Conclusions 

In this chapter we studied the problem of detecting independence of queries from
updates. We provided insight into the problem by relating it to the problems of
detecting irrelevance and equivalence of datalog programs. As a consequence of this
connection, we made several contributions:

1. Provided algorithms that guarantee sufficient conditions for detecting
independence, based on strong irrelevance.

2. Showed additional cases in which detecting independence is decidable, and gave
efficient algorithms for doing so.

3. Showed cases where independence of an insertion is equivalent to independence
of a deletion, thereby making the latter easier to compute.

Viewing the problem of independence from the perspective of irrelevance and
equivalence also suggests that further algorithms for independence can be found by
considering other sufficient conditions for weak irrelevance and equivalence. One such
direction, based on extending uniform equivalence, was discussed in Section 5.3.3.
Other sufficient conditions can be achieved by considering strong irrelevance based on
minimal derivations, as described in Chapter 3. This chapter also made contributions
to the problem of detecting equivalence of datalog programs, which is a recurring
problem in query optimization.

7 In our formalism, a set of relations for the EDB and IDB predicates is a minimal model if the
IDB part is a minimal model once the EDB facts are added to the program as rules with empty
bodies.

5.4.1 Related Work

As discussed throughout the chapter, Blakeley et al. [Blakeley et al., 1989] and
Elkan [Elkan, 1990] have studied the problem of independence. In summary, they
have considered the problem for restricted languages in which strong irrelevance is
the same as weak irrelevance. Blakeley et al. consider non-recursive knowledge bases
without interpreted predicates. Elkan generalizes these results and provides a
decision procedure for strong irrelevance in the case of non-recursive knowledge bases.
Our results extend Elkan's by providing a decision procedure for strong irrelevance
(and therefore, independence) for arbitrary datalog programs. Furthermore, we show
additional cases in which independence is decidable, specifically for arbitrary
non-recursive knowledge bases. It should be noted that Elkan also suggested a proof
method for detecting independence in the recursive case; however, he provides no
characterization of the power of that proof method, and it
cannot capture all cases detected by the query-tree.

Our work also generalizes previous work on containment of conjunctive queries
with interpreted predicates by Klug [Klug, 1988]. Klug showed that if all the
constraints are left-semiinterval or all constraints are right-semiinterval,$^8$ then
containment of conjunctive queries can be decided by finding a homomorphism from one
query to the other. For general conjunctive queries, he pointed out that it could
be done by finding a homomorphism for every possible ordering of the variables and
constants in the queries. The number of such orderings is exponential in the number
of variables appearing in the constraints. Recently, van der Meyden [van der Meyden,
1992] has shown that the containment problem of conjunctive queries with order
constraints is $\Pi_2^p$-complete. In our algorithm, the complexity depends only on the number
of orderings that are actually generated during the evaluation of $\mathcal{P}$. More precisely,
our algorithm generates partial rather than complete orderings of the variables and
constants in the queries. Essentially, it lumps together complete orderings that need
not be distinguished from each other in order to test containment. Therefore, our
algorithm is likely to be better in practice, albeit not in the worst case. Of course, our
algorithm also applies to more than just conjunctive queries by considering recursive
programs as well.


8 See [Klug, 1988] for precise definitions of these restrictions.


Chapter 6 

Irrelevance and Abstractions 


6.1 Introduction 

In the previous chapters we discussed irrelevance claims whose subject was formulas.
When we detected that a formula was irrelevant to a query, that served as a
justification for ignoring the formula when we searched for an answer to the query. Ignoring
an irrelevant formula can be viewed as a simple way of abstracting a knowledge base.
This view suggests that more interesting abstractions can be obtained by considering
other kinds of irrelevance claims, based on different subjects of irrelevance.

Research on reasoning with abstractions focuses on finding abstractions that will
yield more efficient inference. The intuition underlying much of that research is that
a good abstraction is one that removes irrelevant detail. If the details removed in the
abstraction are indeed irrelevant, then the solutions obtained from the abstract
knowledge base will hold in the original knowledge base, or can be refined in a structured
fashion to solutions from the original knowledge base.

This chapter proposes an approach to research on abstractions that exploits the
connection between the notion of irrelevance and the creation of abstractions. It
makes the first steps in formalizing this connection and describes the possible payoffs
from the approach. We begin by illustrating two new irrelevance subjects and the
abstractions that they justify. The potential payoff of the proposed approach is
demonstrated by considering these examples in detail in Sections 6.3 and 6.4. The
first example illustrates the irrelevance of predicate arguments:

Example 6.1: Consider the following rules describing flight routes between cities.
The third arguments of $flight$ and $route$ denote costs of the flights, and their fourth
arguments denote the airline of the flight/route.

$r_1$ : $flight(X, Y, C, A) \Rightarrow route(X, Y, C, A)$

$r_2$ : $flight(X, Z, C_1, A) \wedge route(Z, Y, C_2, A) \Rightarrow route(X, Y, C_1 + C_2, A)$

$r_3$ : $route(X, Y, C, A) \Rightarrow airlineFlight(X, Y, A)$

The knowledge base also contains a set of ground atoms for the predicate $flight$.
The atom $airlineFlight(X, Y, A)$ denotes that there is a route (i.e., some sequence
of flights) from $X$ to $Y$ that uses only airline $A$. Suppose we want to query the
knowledge base for the existence of a flight from SF to LA on $A_2$ airlines, i.e.,
$airlineFlight(SF, LA, A_2)$. The costs of flights are irrelevant to this query, i.e., the
third argument of $flight$ and $route$ can be projected out to yield a smaller knowledge
base and search space. Specifically, we can rewrite the rules as follows:

$r_1'$ : $flight'(X, Y, A) \Rightarrow route'(X, Y, A)$

$r_2'$ : $flight'(X, Z, A) \wedge route'(Z, Y, A) \Rightarrow route'(X, Y, A)$

$r_3'$ : $route'(X, Y, A) \Rightarrow airlineFlight(X, Y, A)$

We also project out the third argument in the ground atoms of the predicate
$flight$. Consequently, multiple $flight$ facts describing different fares for the same flight
are collapsed to one fact in the knowledge base. As a result, the knowledge base will
contain fewer facts and simpler rules, and therefore the space searched may be
significantly smaller than in the original one. For example, consider the difference between
the rules $r_2$ and $r_2'$. In rule $r_2$, if we fail to join a ground atom $flight(x, z, c_1, a)$ with
a ground atom $route(z, y, c_2, a)$, the backward chainer might still try to join the atom
$flight(x, z, c_1', a)$ with an appropriate atom of $route$ for every value $c_1'$ for which it
finds an atom in the $flight$ database, and will fail on all of them. In contrast, rule $r_2'$
will not try other costs for the same flight route.

Although in some simple cases these repetitions can be eliminated by employing
some method of dependency directed backtracking, such methods will not be as
general as projecting out arguments and will also have additional costs associated with
maintaining the dependencies. □
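
The effect of projecting out the cost argument can be seen on a small set of facts. The sketch below uses made-up flight data and hypothetical function names; it evaluates the abstracted rules bottom-up, whereas the discussion above concerns a backward chainer, so it illustrates only the reduction in facts and the preservation of the answer.

```python
def project_out_cost(flights):
    """Drop the third (cost) argument of every flight fact."""
    return {(x, y, a) for (x, y, _cost, a) in flights}


def routes(flight3):
    """Bottom-up evaluation of the abstracted rules for route'."""
    route = set(flight3)  # first abstracted rule: every flight is a route
    while True:
        # Second abstracted rule: extend a route by one flight on the same airline.
        new = {(x, y, a)
               for (x, z, a) in flight3
               for (z2, y, a2) in route
               if z == z2 and a == a2} - route
        if not new:
            return route
        route |= new


# Hypothetical data: two fares for the same SF-LA flight on airline A2.
flights = {("SF", "LA", 120, "A2"), ("SF", "LA", 95, "A2"),
           ("SF", "SJ", 50, "A2"), ("SJ", "SD", 60, "A2"),
           ("SF", "SEA", 80, "A1")}
flight3 = project_out_cost(flights)
airline_flight = routes(flight3)
```

The two SF-LA fares collapse into a single fact, and the query airlineFlight(SF, LA, A2) is still answered from the smaller relation.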

The following example illustrates the irrelevance of a predicate refinement:

Example 6.2: Consider a knowledge base with the following formulas:

$r_1$ : $sportsCar(X) \Rightarrow car(X)$

$r_2$ : $familyCar(X) \Rightarrow car(X)$

$r_3$ : $car(X) \Rightarrow vehicle(X)$
$r_4$ : $bicycle(X) \Rightarrow vehicle(X)$

$r_5$ : $sportsCar(X) \Rightarrow highRiskInsurance(X)$
$r_6$ : $car(X) \Rightarrow hasMotor(X)$

$r_7$ : $vehicle(X) \wedge hasMotor(X) \Rightarrow motorizedVehicle(X)$
$g_1$ : $familyCar(Camry)$

Consider the query $motorizedVehicle(Camry)$. With respect to that query, the
refinement between sports cars and family cars is irrelevant. Intuitively, all that
matters about the Camry is that it is a car. Consequently, the query can be solved
if we abstract the representation of the domain by removing the refinement between
the predicates $sportsCar$ and $familyCar$. To do so, we remove rules $r_1$ and $r_2$ and
replace $g_1$ by the formula:

$g_1'$ : $car(Camry)$

The rule $r_5$ is also removed because it distinguishes between the different types
of cars, and is therefore irrelevant. Abstracting the representation will yield more
efficient inference for several reasons. First, it is no longer necessary to derive that a
Camry is a car. In the example, there are only two rules that may be used to derive
that, but in general, there may be many different subclasses of cars and the cost of
deriving $car(Camry)$ may be arbitrarily large. Second, by removing the formulas
that distinguish between the types of cars, we reduce the size of the space that needs
to be searched. □
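
A short forward-chaining sketch confirms that the query of Example 6.2 has the same answer in the original and in the abstracted knowledge base. Since every predicate here is unary and every rule uses a single variable, a rule is encoded simply as (list of body predicates, head predicate); this encoding and the function name `closure` are assumptions made for the illustration.

```python
def closure(rules, facts):
    """Forward-chain unary single-variable rules to a fixpoint."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            for _, obj in list(facts):
                if all((b, obj) in facts for b in body) and (head, obj) not in facts:
                    facts.add((head, obj))
                    changed = True
    return facts


original_rules = [
    (["sportsCar"], "car"), (["familyCar"], "car"),
    (["car"], "vehicle"), (["bicycle"], "vehicle"),
    (["sportsCar"], "highRiskInsurance"), (["car"], "hasMotor"),
    (["vehicle", "hasMotor"], "motorizedVehicle"),
]
# The abstraction drops the three rules that mention sportsCar or familyCar.
abstract_rules = [
    (["car"], "vehicle"), (["bicycle"], "vehicle"),
    (["car"], "hasMotor"), (["vehicle", "hasMotor"], "motorizedVehicle"),
]

full = closure(original_rules, {("familyCar", "Camry")})
abstracted = closure(abstract_rules, {("car", "Camry")})
```

Both closures contain motorizedVehicle(Camry), while the abstracted knowledge base has four rules instead of seven and never needs to derive car(Camry).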

Recently, research on abstractions and approximations has received renewed
attention [Ellman, 1990; Ellman, 1992; Lowry, 1992]. However, two key problems in
this field remain largely open. The first is how a system can automatically create
an abstraction that is well suited to a particular query. The second challenge is
understanding the utility of reasoning with multiple levels of abstraction. Our
approach addresses these issues as follows. When an abstraction is being considered,
our approach is to articulate which knowledge is being removed in the process of the
abstraction and to justify the abstraction by the fact that this knowledge is
irrelevant to the query at hand. Reasoning about abstractions then becomes a problem
of reasoning about irrelevance. The formal analysis of irrelevance will give us several
insights into the corresponding abstractions:

1. The problem of automatically generating abstractions becomes well defined as
a problem of automatically deriving irrelevance claims. Often this can be done
by using existing algorithms for automatically deriving irrelevance claims, as
we see in Sections 6.3 and 6.4.

2. Understanding the utility of exploiting irrelevance claims gives us insight into
the utility of the abstraction based on it. For example, if an abstraction is based
on a weak irrelevance claim, then it is not necessarily computationally
advantageous, whereas if it is based on a strong irrelevance claim, it is guaranteed to
lead to computational savings. Furthermore, the underlying irrelevance claims
can indicate whether abstractions can be composed, based on composing the
underlying irrelevance claims.

3. The ability to explicitly state the irrelevance claims underlying abstractions
provides us with a formalism in which we can reason about abstractions. This
is useful in several scenarios:

• In general, we cannot expect a single abstraction hierarchy to be well
suited for all possible queries. Therefore we need to tailor our abstractions
for specific queries, and in doing so we can often be aided by additional
knowledge about the domain. Expressing such knowledge in the form of
irrelevance claims and combining it with other knowledge about the domain
provides a powerful mechanism for incorporating domain knowledge into
the creation of abstractions.

• Several problem solving scenarios give rise to situations in which we are
given multiple descriptions of aspects of a domain at varying levels of
abstraction. In such a situation our task is twofold. First, we need to select
the level of abstraction that is best suited for a given query, and second,
we need to combine descriptions of different aspects of the domain to
create one coherent and consistent description. By stating explicitly the
assumptions underlying the multiple descriptions, we can reason about their
consistency and adequacy. The following are examples of such scenarios:

(a) Reasoning about physical systems: In this domain (discussed in
detail in Chapter 7) we are given descriptions of physical phenomena
in the world at different levels of abstraction. Our task is to compose
descriptions of relevant aspects of the system such that we can answer
a query about a given system. For example, suppose we are composing
a representation for a given device that includes a battery connected to
a wire, each of which can be described at different levels of abstraction.
In particular, we can describe each of them under the assumption that
their electrical properties are irrelevant or without that assumption.
Reasoning about the assumptions underlying these descriptions will
ensure that we do not compose a description of the battery that ignores
its electrical properties (e.g., its voltage) with a description of the wire
that considers the voltage of the battery relevant.

(b) Reasoning with contexts: Contexts [Guha, 1991] are small
theories that describe limited aspects of the world. A knowledge base
describing a complex domain can benefit from being divided into
contexts both in simplicity of representation and efficiency of reasoning.
Here too, answering a query requires that we decide which contexts
are relevant to the query and are consistent with each other. This can
be done by reasoning with explicit statements about the assumptions
underlying each context.

(c) Distributed heterogeneous databases: A distributed database
(whether centrally managed or employing a federated architecture)
may contain several databases with overlapping data. The databases
may describe the data with different levels of abstraction and
assumptions. As in the previous two cases, given a query, the system must
find the (parts of the) databases that are needed to answer the query
and must combine knowledge from the different databases to provide
a coherent answer. An additional challenge here is to minimize the
costs that may be associated with accessing remote databases.

It should be noted that in the latter two examples, before we can reason
about the abstractions underlying the different contexts or databases, we
must resolve the semantic mismatches between the descriptions used in
each context/database.$^1$

6.2 New Irrelevance Subjects 

As explained, abstractions can be obtained by considering new kinds of irrelevance
subjects. This section describes informally several such subjects and shows how they
account for abstractions with which we are familiar. The irrelevance subjects that
we discuss are broadly divided into two classes, one that concerns relations in the
domain (and their corresponding predicate symbols) and one involving objects in the
domain. The irrelevance subjects concerning relations include the following:

• Predicate irrelevance: We may state that a certain predicate (representing a
relation in the domain) is irrelevant to a query. For example, if we are modeling
the behavior of a battery for a short period of time, the property of being a
rechargeable battery is irrelevant. Such an irrelevance claim can justify
simplifying formulas by removing literals containing the irrelevant predicate (or by
removing formulas completely).

• Predicate refinement: Irrelevance of predicate refinements can appear in
two forms. In the first, illustrated in Example 6.2 (and further investigated
in Section 6.4), we have a set of predicates $q_1, \ldots, q_n$ and an irrelevance claim
stating that the refinement among them is irrelevant. That means we only need to know that
a certain object (or tuple of objects) belongs to one of the $q_i$'s, but not which
one. Consequently, the abstraction will replace $q_1, \ldots, q_n$ by a new predicate $q$
intended to denote the union of the interpretations of $q_1, \ldots, q_n$.

1 In research on modeling physical systems, most works make the assumption that all the
descriptions of the domain are based on one consistent ontology.
In the second form, we want to remove $q_1, \ldots, q_n$ because we are only interested
in objects that belong to all the $q_i$'s. Consequently, we will replace $q_1, \ldots, q_n$ by
a predicate that denotes the intersection of the relations denoted by $q_1, \ldots, q_n$.

For example, we may have two predicates adult and citizen. However, in a
theory that represents the domain of elections, we wish to represent only objects
that satisfy the intersection of these two relations. Such an abstraction will
enable us to remove many formulas as well as save computation of intersections
(or more generally, joins of relations).

• Predicate argument: As we saw in Example 6.1, we can often simplify a
representation by reducing the arity of some predicates (i.e., projecting them
on a subset of their arguments). Section 6.3 will discuss this subject in detail.

Irrelevance subjects concerning objects in the domain include the following:

• Object irrelevance: We may state that a certain object in the domain is
irrelevant to a query. For example, we may state that the battery of the car
is irrelevant to a query regarding its transmission system. Consequently, we
can ignore formulas in the knowledge base that include constants (or terms)
denoting the irrelevant object.

• Object refinement: As with predicate refinements, we can state that a
refinement between objects is irrelevant to a query. Given a set of objects $a_1, \ldots, a_n$,
we can replace them with a single object $a$. As in the case of predicate
refinements, there are two ways we can interpret $a$. The first is to assume that $a$ has
only those properties that are common to each of the $a_i$'s, and the second is to
assume that $a$ has any property that any of the $a_i$'s has. 2 For example,
suppose we are reasoning about a chemical reaction. We do not need to represent
each molecule in a given solution. Instead, we reason with one representative
molecule of each different type. We ascribe to a representative molecule all the
properties that are shared by all molecules of its type. 3

• Object aggregation: A common abstraction that arises in many contexts
is aggregation. Instead of representing a set of objects, we represent only a
single object denoting their aggregate. For example, instead of representing
the parts of a chair, we can represent the chair as a single object. We can use
this representation when the properties that are relevant to the query are only
those that apply to the aggregate and not to its subparts. Note that object

2 Of course, special care must be given to formulas of the form $a_1 \neq a_2$.

3 One can also devise methods for abstracting object refinements that lie in the middle of these
two extremes. For example, we can associate with the representative object only the properties that
are typical of the set it is representing.
aggregation is different from object refinement. Here, the new object represents
the aggregation of a set of objects, while in the case of object refinement the
new object denotes a representative of objects in a set.

• Object homogeneity: In object refinement, we replaced a set of objects by
a single one. However, in some cases we may need to retain all the objects
(e.g., their number is important), but we want to represent them as a set of
homogeneous objects. This means that we abstract all the differences between
them except for object identity. For example, consider the 15 puzzle. A powerful
heuristic for solving the puzzle is to place the first tile in place, and then proceed
to place the second (while keeping the first in a fixed position), etc. For the
subgoal of placing the first tile in place, there is no need to distinguish between
the tiles 2-15. Two states that differ only in the location of one of these tiles
should be indistinguishable. Making this abstraction reduces the number of
possible states from $16!$ ($\approx 10^{13}$) to $15 \times 16 = 240$ states.
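
The state counts in the 15 puzzle example can be checked with a few lines of arithmetic. The sketch below is illustrative only; reading the figure of 240 as "16 cells for tile 1 times 15 remaining cells for the blank" is our assumption about how the count arises.

```python
import math

# Concrete state space: 15 tiles plus the blank arranged on 16 cells.
concrete_states = math.factorial(16)

# Abstract state space for the first subgoal (assumed reading): only the
# position of tile 1 and the position of the blank are distinguished.
abstract_states = 16 * 15

print(concrete_states)   # 20922789888000, i.e. roughly 2 * 10**13
print(abstract_states)   # 240
```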

In addition to irrelevance subjects concerning relations and objects, we can also
consider subjects that abstract function symbols. In some domains, we can also
consider more specific subjects. For example, in planning we can consider irrelevance
of states, actions or action preconditions.

6.2.1 Defining Irrelevance of New Subjects 

As a basis for the approach we are proposing, we need to make a formal connection 
between abstractions and irrelevance. This section shows how irrelevance of new 
subjects can be formalized in the framework we discussed in Chapter 2. 

The definitions of irrelevance that we have considered were based on the intuition
that a subject $\phi$ is irrelevant to a query if $\phi$ can be removed without changing the
answer to the query. In the case of $\phi$ being a formula, removing $\phi$ meant literally
removing it from the knowledge base (or revising the knowledge base so it does not
entail $\phi$). For the new irrelevance subjects, although we have an intended abstraction
in mind for removing $\phi$, the actual removal involves subtle details. For example, in
the case of irrelevance of a predicate refinement $\{q_1, \ldots, q_n\}$, we would remove it by
replacing all occurrences of $q_i$ by a new predicate $q$. The intended interpretation of
$q$ is the union of the interpretations of $\{q_1, \ldots, q_n\}$. However, in doing so we must
be careful. For example, if we are removing the distinction between the predicates
{sportsCar, familyCar}, and we have the formula

    $\mathit{familyCar}(X) \Rightarrow \neg\,\mathit{sportsCar}(X)$

then performing the substitution would result in the contradiction

    $\mathit{car}(X) \Rightarrow \neg\,\mathit{car}(X)$.
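
The pitfall of blind substitution can be reproduced mechanically. The following is a minimal sketch, assuming a toy encoding of our own in which a rule is a pair of signed literals (sign, predicate, argument); the helper names are hypothetical and not part of the framework.

```python
REFINEMENT = {"sportsCar", "familyCar"}

def substitute(literal, new_pred="car"):
    """Rename a literal (sign, predicate, argument) if its predicate is refined."""
    sign, pred, arg = literal
    return (sign, new_pred if pred in REFINEMENT else pred, arg)

def denies_own_antecedent(rule):
    """True for rules of the form p(X) => not p(X)."""
    (s1, p1, a1), (s2, p2, a2) = rule
    return s1 == "+" and s2 == "-" and (p1, a1) == (p2, a2)

# familyCar(X) => not sportsCar(X)
rule = (("+", "familyCar", "X"), ("-", "sportsCar", "X"))
abstracted = tuple(substitute(lit) for lit in rule)

print(abstracted)                         # car(X) => not car(X)
print(denies_own_antecedent(abstracted))  # True
```

The original rule is harmless, while its naively abstracted image rules out the existence of any car, which is why the abstraction is applied only to formulas that pass an independence test.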

Assuming that we have some function $Abs_\phi$ for abstracting a given knowledge
base $\Delta$ which does not introduce unwanted formulas, our intuition of irrelevance
would imply that $\phi$ is irrelevant to a query $q$ w.r.t. a knowledge base $\Delta$ if

    $\Delta \vdash q \iff Abs_\phi(\Delta) \vdash Abs_\phi(q)$.    (6.1)

In words, $\phi$ is irrelevant to the query $q$ if abstracting the representation by
removing $\phi$ (resulting in the knowledge base $Abs_\phi(\Delta)$) does not change the derivability of
the query. As we saw in Chapter 2, a more refined account of irrelevance, based on
a proof theoretic analysis, enables us to address the key issues regarding irrelevance.
In this case, we want our analysis to help in developing algorithms for automatically
justifying and creating abstractions and in analyzing the utility of reasoning with
abstractions.

We now extend the framework described in Section 2.3 to new irrelevance subjects.
Recall that a definition of irrelevance in our space was obtained by considering some
condition $DI$ (which depended on the subject $\phi$) over a chosen set of derivations of
the query $\mathcal{D}_0$. We said that $\phi$ is strongly irrelevant to a query $q$ if $DI(\phi, D)$ holds for
all derivations $D \in \mathcal{D}_0$, and that it is weakly irrelevant to $q$ if $DI(\phi, D)$ holds for some
derivation $D \in \mathcal{D}_0$. To extend the framework, we consider appropriate definitions of
the predicate $DI$.

The definitions we consider for $DI$ are based on identifying formulas that are
independent of the irrelevance subject $\phi$. 4 Intuitively, a formula is independent of $\phi$ if
it does not rely on $\phi$, i.e., it holds even if $\phi$ is removed. The definition of independence
will also be used to define the abstract knowledge base $Abs_\phi(\Delta)$ such that it does not
introduce unwanted formulas. Specifically, $Abs_\phi(\Delta)$ will contain the abstractions of
the formulas in $\Delta$ that are independent of $\phi$. Formal definitions of independence will
be given in subsequent sections. The following examples motivate the concept.

Consider Example 6.2 and the predicate refinement {sportsCar, familyCar} and
the rule

    $r$ : $\mathit{sportsCar}(X) \Rightarrow \mathit{vehicle}(X)$.

The rule $r$ is independent of the predicate refinement, because if we replaced
occurrences of sportsCar by familyCar, the resulting rule

    $r'$ : $\mathit{familyCar}(X) \Rightarrow \mathit{vehicle}(X)$

also follows from the knowledge base. Therefore, replacing $r$ by the rule

    $\mathit{car}(X) \Rightarrow \mathit{vehicle}(X)$


4 We do not claim that these are the only viable definitions.
would not contradict our previous knowledge or add to it. In contrast, the rule

    $s$ : $\mathit{sportsCar}(X) \Rightarrow \mathit{highRiskToInsure}(X)$

is not independent of the predicate refinement because the rule

    $s'$ : $\mathit{familyCar}(X) \Rightarrow \mathit{highRiskToInsure}(X)$

does not follow from the KB. Based on a definition of independence, we can consider
several definitions of $DI$ in the same fashion we did in Section 2.3:

Definition 6.3:

• $DI_1(\phi, D)$ if $Base(D)$ does not contain any formula that is not independent of
$\phi$. 5

• $DI_2(\phi, D)$ if $D$ does not contain any formula that is not independent of $\phi$.

• $DI_3(\phi, D)$ if $Base(D)$ does not entail any formula that is not independent of $\phi$.


Returning to Example 6.2, the predicate refinement $\phi$ = {sportsCar, familyCar}
is strongly irrelevant to the query $q$ = motorizedVehicle(Camry) because the (single)
derivation of $q$ uses only formulas that are independent of $\phi$. On the other hand, $\phi$ is
not strongly (or weakly) irrelevant to the query $q_1$ = highRiskToInsure(X) because
derivations of $q_1$ will use the rule $s$.

In the following sections, we consider specific definitions of independence (and
irrelevance) and show how they are used to develop algorithms for automatically
creating abstractions.

6.3 Irrelevance of Predicate Arguments 

As illustrated in Example 6.1, it is sometimes possible to simplify a representation
by projecting out arguments of some predicates, thereby reducing their arity and
leading to more efficient reasoning. Intuitively, we can project out the arguments if
they are irrelevant to a query. An argument is irrelevant to a query if the solution of
the query requires some values for that argument, but the actual values used are not
important, and they do not have to satisfy any other constraints. In this section, we
will formalize the irrelevance of a predicate argument.

5 Recall that $Base(D)$ is the set of formulas in the leaves of the proof tree (see Section 2.11).

We denote a set of predicate arguments by $\mathcal{R}$, which is a set of pairs $(q_i, n_i)$ where
$q_i$ is a predicate and $n_i$ is an integer less than or equal to the arity of $q_i$. For example, the
set {(flight, 3), (route, 4)} represents the third argument of the predicate flight and
the fourth argument of the predicate route. In projecting out the set $\mathcal{R}$, we perform
the following syntactic transformation $f_{\mathcal{R}}(\phi)$ to a formula $\phi$ in the knowledge base:


    Let $p(\bar{X}_1), \ldots, p(\bar{X}_m)$ be the literals of the predicate $p$ in $\phi$ (both positive
    and negative). Suppose the arity of $p$ is $l$ and that $(p, i_1), \ldots, (p, i_k) \in \mathcal{R}$.
    We introduce a new predicate $p'$ of arity $l - k$. We replace each atom $p(\bar{X}_j)$
    of $p$ by an atom $p'(\bar{Y}_j)$, where $\bar{Y}_j$ is the result of projecting arguments
    $i_1, \ldots, i_k$ out of $\bar{X}_j$. Note that $p'$ may be of arity 0. If $p$ is an order
    predicate we replace its atoms by True.

The function $f_{\mathcal{R}}$ is extended in a straightforward manner to sets of formulas. Note
that if a predicate $p$ appears in $\mathcal{R}$, then we replace it with the same new predicate
in every formula. As explained earlier, simply applying the substitution $f_{\mathcal{R}}$ to all
formulas in the knowledge base may introduce inconsistencies. We therefore apply
the substitution only to formulas that are independent of $\mathcal{R}$. Our definitions of
irrelevance are also based on the notion of independence.
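
The syntactic transformation can be sketched in a few lines. The encoding below (atoms as (predicate, argument-tuple) pairs, a Horn rule as a (body, head) pair, and the names project_atom and project_rule) is our own illustrative assumption; the special treatment of order predicates is omitted.

```python
def project_atom(atom, removed):
    """Project the argument positions listed in `removed` out of one atom."""
    pred, args = atom
    drop = {i for (p, i) in removed if p == pred}   # 1-based positions to drop
    if not drop:
        return atom
    kept = tuple(a for pos, a in enumerate(args, start=1) if pos not in drop)
    return (pred + "'", kept)                        # the new predicate p'

def project_rule(rule, removed):
    """Apply the projection to every atom of a Horn rule (body, head)."""
    body, head = rule
    return ([project_atom(a, removed) for a in body], project_atom(head, removed))

# Rule r_2 of Example 6.1, with the cost argument written as a plain string.
r2 = ([("flight", ("X", "Z", "C1", "A")), ("route", ("Z", "Y", "C2", "A"))],
      ("route", ("X", "Y", "C1+C2", "A")))
removed = {("flight", 3), ("route", 3)}

projected = project_rule(r2, removed)
print(projected)
```

Projecting out the cost arguments leaves a rule over the ternary predicates flight' and route' that mentions only the endpoints and the airline.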

Our definition of independence is based on the semantics of the abstraction we are
performing. Specifically, if a predicate $p$ denotes a relation $P$, and we project out some
of the arguments of $p$, then the resulting predicate should denote the corresponding
projection of $P$. Given an interpretation $I$ for the symbols in a knowledge base $\Delta$,
we define an interpretation $Abs(I)$ for the symbols in $f_{\mathcal{R}}(\Delta)$ as follows:


• The interpretations of terms in $I$ and $Abs(I)$ are identical.

• If the predicate $p$ does not appear in $\mathcal{R}$, then $p$ is mapped to the same relation
as in $I$.

• If $(p, i_1), \ldots, (p, i_k) \in \mathcal{R}$, and $p$ was mapped to the relation $P$, then the predicate
$p'$ is mapped to the relation $P'$, where $P'$ is the result of projecting the arguments
$i_1, \ldots, i_k$ out of $P$.

Independence is defined as follows:

Definition 6.4: A formula $\psi$ is independent of the predicate arguments $\mathcal{R}$ if for any
interpretation $I$:

    $I \models \Delta \implies Abs(I) \models f_{\mathcal{R}}(\psi)$.
Intuitively, a clause $C$ is independent of $\mathcal{R}$ if the formula $f_{\mathcal{R}}(C)$ does not impose
any additional constraints on the possible states of the world, as described by $\Delta$ (and
therefore does not add any new knowledge). We define the abstract knowledge base
resulting from removing $\mathcal{R}$ from $\Delta$, denoted by $Abs_{\mathcal{R}}(\Delta)$. It includes the abstractions
of formulas in $\Delta$ that are independent of $\mathcal{R}$, i.e.,

    $Abs_{\mathcal{R}}(\Delta) = \{ f_{\mathcal{R}}(\psi) \mid \psi \in \Delta \text{ and } \psi \text{ is independent of } \mathcal{R} \}$.

In Example 6.1, the rule

    $r_2$ : $\mathit{flight}(X, Z, C_1, A) \wedge \mathit{route}(Z, Y, C_2, A) \Rightarrow \mathit{route}(X, Y, C_1 + C_2, A)$

is independent of the arguments {(flight, 3), (route, 3)}, but is not independent of
the arguments {(flight, 4), (route, 4)}. To see the latter, consider the interpretation
$I$ in which flight denotes the single tuple relation {(a, b, 1, twa)} and route denotes
the relation {(b, c, 1, united)}. The abstract interpretation $Abs(I)$ will map flight' to
{(a, b, 1)} and route' to {(b, c, 1)}. Therefore, $I$ is a model of $r_2$ (its antecedent is never
satisfied, because the airlines differ), but $Abs(I)$ is not a model of $f_{\mathcal{R}}(r_2)$.
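
The counterexample can be verified by brute force over the two small relations. The sketch below hard-codes the shape of $r_2$ and of its projection on the airline argument; the function names are illustrative assumptions of ours.

```python
def satisfies_r2(flight, route):
    """flight(X,Z,C1,A) and route(Z,Y,C2,A) imply route(X,Y,C1+C2,A)."""
    for (x, z, c1, a) in flight:
        for (z2, y, c2, a2) in route:
            if z == z2 and a == a2 and (x, y, c1 + c2, a) not in route:
                return False
    return True

def satisfies_projected_r2(flight_p, route_p):
    """The same rule after projecting out the fourth (airline) argument."""
    for (x, z, c1) in flight_p:
        for (z2, y, c2) in route_p:
            if z == z2 and (x, y, c1 + c2) not in route_p:
                return False
    return True

def project_out(relation, position):
    """Drop the 1-based argument `position` from every tuple of a relation."""
    return {t[:position - 1] + t[position:] for t in relation}

flight = {("a", "b", 1, "twa")}
route = {("b", "c", 1, "united")}
flight_p, route_p = project_out(flight, 4), project_out(route, 4)

print(satisfies_r2(flight, route))                # True
print(satisfies_projected_r2(flight_p, route_p))  # False: route'(a, c, 2) is missing
```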

Based on independence, we can define irrelevance using the definitions of $DI$ given
in Definition 6.3. For example, we can define $\mathcal{R}$ to be weakly irrelevant to a query $q$
if there is some derivation of $q$ that contains only formulas independent of $\mathcal{R}$. The
following theorem shows that weak irrelevance provides a logical justification for the
abstraction that fits our intuitions stated in Equation 6.1, i.e., provides a justification
for using $Abs_{\mathcal{R}}(\Delta)$ instead of $\Delta$:

Theorem 6.5: Let $\mathcal{D}_q(\Delta)$ be the set of derivations of the query $q$ from the knowledge
base $\Delta$. If $WI(\mathcal{R}, q, \Delta, DI_1, \mathcal{D}_q)$ holds then

    $\Delta \vdash q \implies Abs_{\mathcal{R}}(\Delta) \vdash f_{\mathcal{R}}(q)$

and, if $q$ contains no irrelevant arguments (i.e., $q = f_{\mathcal{R}}(q)$), then

    $Abs_{\mathcal{R}}(\Delta) \models f_{\mathcal{R}}(q) \implies \Delta \models q$.

Note that if we have a complete set of inference rules (e.g., backward chaining for
atomic queries in Horn rule knowledge bases), then the above theorem implies 6

    $\Delta \vdash q \iff Abs_{\mathcal{R}}(\Delta) \vdash f_{\mathcal{R}}(q)$.

6 Note that we are assuming throughout that our inference rules are sound.
Automatically Deriving Irrelevance of Predicate Arguments

The following observation follows from the definitions of irrelevance and will form the
basis for algorithms for deriving irrelevance of predicate arguments:

Observation 6.6: Let $\Phi$ be a set of clauses such that $WI(\Phi, q, \Delta, DI_1, \mathcal{D}_0)$ holds,
and such that all clauses in $\Delta - \Phi$ are independent of the arguments $\mathcal{R}$. Then
$WI(\mathcal{R}, q, \Delta, DI_1, \mathcal{D}_0)$ holds, and moreover, we can abstract $\Delta$ by $Abs_{\mathcal{R}}(\Delta - \Phi)$.

Therefore, to derive irrelevance of a set of predicate arguments $\mathcal{R}$, our strategy
will be to find a set of clauses $\Phi$ such that $\Phi$ is weakly irrelevant to $q$ w.r.t. $\Delta$ and
all the clauses in $\Delta - \Phi$ are independent of $\mathcal{R}$. Note that ground atomic formulas
are always independent of any set of predicate arguments. Therefore, we need only
consider clauses that are not ground atomic, and our results will be independent of
any changes made to ground atomic clauses in our knowledge base.

Finding a set $\Phi$ can be done using any of the methods described so far. For
example, in the case of Horn rule knowledge bases, the query-tree can be used to
find all the formulas $\Phi$ that are strongly irrelevant to the query, and (since strong
irrelevance entails weak irrelevance) the formulas $\Phi$ are weakly irrelevant to the
query as well. The algorithms in Chapter 5 can be used to detect additional weakly
irrelevant formulas that are not detected by the query-tree. Finally, for general clause
form knowledge bases, we can use connection graphs to derive sufficient conditions
for strong irrelevance.

To derive irrelevance of predicate arguments, we need to check that the formulas
in the set $\Delta - \Phi$ are independent of a set of predicate arguments $\mathcal{R}$. To facilitate this
check, the following theorem provides a syntactic condition for independence. We
assume that a formula $C$ is given in clause form (cf. [Genesereth and Nilsson, 1987]).

A literal in a clause is negative if it is a negation of an atomic formula (e.g., $\neg q(X)$
is a negative literal, while $p(X, Y)$ is a positive literal). $Neg(C)$ ($Pos(C)$) denotes
the set of negative (positive) literals in a clause $C$. We assume that $Neg(C)$ contains
only simple terms, i.e., variables or constants. $Pos(C)$ can contain arbitrary terms.
Given a clause $C$, we denote by $AllPos(p, i, C)$ the set of terms that appear in the $i$th
position of occurrences of $p$ in $C$.

Theorem 6.7: Let $C$ be a clause and $\mathcal{R}$ be a set of arguments. The clause $C$ is
independent of $\mathcal{R}$ if the following condition holds for every term $t$ in the set $AllPos(p, i, C)$
for each $(p, i) \in \mathcal{R}$:

A1. If $t$ is not a variable, then $t$ does not appear in $Neg(C)$.

A2. If $t$ is a variable, then it has at most a single appearance in $Neg(C)$.

A3. If $t$ is a variable appearing in $Neg(C)$ and appears also in a term in position $j$
of a predicate $q$ in $Pos(C)$, then $(q, j) \in \mathcal{R}$.

The proof is given in Appendix A. 7 It is easy to see that if $C$ is a clause for
which the sets of arguments $\mathcal{R}_1$ and $\mathcal{R}_2$ satisfy conditions A1, A2 and A3, then the set
$\mathcal{R}_1 \cup \mathcal{R}_2$ will satisfy these conditions too.
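
Conditions A1, A2 and A3 are purely syntactic and can be checked directly. The following is a minimal sketch under an encoding we assume for illustration: a clause is a pair (neg, pos) of atom lists, an atom is (predicate, argument-tuple), variables are capitalized strings, and a compound term such as $C_1 + C_2$ is a tuple ("+", "C1", "C2"). The function names are hypothetical.

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter (assumed convention)."""
    return isinstance(t, str) and t[:1].isupper()

def variables(t):
    """Set of variables occurring in a term; compound terms are (functor, arg, ...)."""
    if isinstance(t, tuple):
        return set().union(*(variables(u) for u in t[1:]))
    return {t} if is_var(t) else set()

def satisfies_a1_a3(neg, pos, removed):
    """Sufficient test for independence of the clause (neg, pos) from `removed`."""
    neg_args = [a for _, args in neg for a in args]
    for (p, i) in removed:
        all_pos = {args[i - 1] for pred, args in neg + pos if pred == p}
        for t in all_pos:
            if not is_var(t):
                if t in neg_args:              # A1: non-variable must not occur in Neg(C)
                    return False
                continue
            if neg_args.count(t) > 1:          # A2: at most one appearance in Neg(C)
                return False
            if t in neg_args:                  # A3: positions in Pos(C) must be in `removed`
                for q, q_args in pos:
                    for j, u in enumerate(q_args, start=1):
                        if t in variables(u) and (q, j) not in removed:
                            return False
    return True

# Rule r_2 of Example 6.1 in clause form.
neg = [("flight", ("X", "Z", "C1", "A")), ("route", ("Z", "Y", "C2", "A"))]
pos = [("route", ("X", "Y", ("+", "C1", "C2"), "A"))]

print(satisfies_a1_a3(neg, pos, {("flight", 3), ("route", 3)}))  # True
print(satisfies_a1_a3(neg, pos, {("flight", 4), ("route", 4)}))  # False (A2 fails for A)
```

Since the conditions are sufficient but not necessary, a False result only means that independence could not be established syntactically.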

Given a set of formulas $\Psi$ and a set of arguments $\mathcal{R}$, we can simply check whether
each of the formulas in $\Psi$ is independent of $\mathcal{R}$ using the conditions of Theorem 6.7.
However, a more interesting question is how we can automatically find the maximal
set $\mathcal{R}$ of arguments such that each of the formulas in $\Psi$ is independent of $\mathcal{R}$, given a
specific query $q$. 8 We now describe an algorithm for finding such a set $\mathcal{R}$.

Given a clause $C$ and an argument $i$ of a predicate $p$ that appears in $C$, two things
may happen. The first is that there is no set of arguments $\mathcal{R}$ such that $(p, i) \in \mathcal{R}$ and
$C$ is independent of $\mathcal{R}$. In this case we will say that $(p, i)$ is needed in $C$. Otherwise,
we denote by $PC(C, p, i)$ the minimal such set of arguments. Note that $PC(C, p, i)$ is
unique since it can be determined by repeatedly applying condition A3. Furthermore,
note that $PC(C, p, i)$ can be the singleton set containing $(p, i)$.

Our algorithm starts out assuming that every argument of every predicate, except
for the query predicate, is irrelevant to the query. It makes one pass over the clauses
in $\Psi$ and either removes arguments from the list of irrelevant arguments, or adds
preconditions for the inclusion of other arguments in the list. Finally, it removes
from the irrelevant list any argument whose preconditions are not satisfied. The
algorithm is shown in Figure 6.1.

Consider the application of the algorithm to the rules in Example 6.1, with the
query airlineFlight(SF, LA, I\ 2). The set $\mathcal{R}$ initially includes all the arguments of
route and flight. When considering the rule $r_1$, the algorithm adds the argument
(route, $i$) to the preconditions of (flight, $i$), for $i = 1, \ldots, 4$. Considering rule $r_2$,
the algorithm removes the arguments (flight, 2), (flight, 4), (route, 1) and (route, 4)
from $\mathcal{R}$. As a consequence, the argument (flight, 1) is removed from $\mathcal{R}$ because its
precondition was removed. Finally, in considering rule $r_3$, the argument (route, 2)
is removed from $\mathcal{R}$ because the argument (airlineFlight, 2) is not a member of $\mathcal{R}$.
Therefore, the algorithm returns that the arguments (flight, 3) and (route, 3) are
irrelevant to the given query.

7 Note that the conditions A1, A2 and A3 are not necessary conditions for independence. However,
a necessary condition is considerably more elaborate and is not presented here. For example, the
rule

    $p(X, X, Y) \Rightarrow q(X, X, Y)$

is independent of the set of arguments {(p, 1), (p, 2), (q, 1), (q, 2)}, but condition A2 is not satisfied.

8 Note that the arguments of $q$ are assumed to be relevant to the query.




CHAPTER 6. IRRELEVANCE AND ABSTRACTIONS 


procedure find-irrelevant-arguments($\Psi$, $q$)
begin /* $\Psi$ are the clauses and $q$ is the query predicate */
    $\mathcal{P}$ = the predicates appearing in $\Psi$, and not in $q$.
    $\mathcal{R}$ = {$(p, i)$ | $p \in \mathcal{P}$ and $i$ is an argument of $p$}.
    for every $s \in \mathcal{R}$, Preconditions($s$) = {}.
    for every $C \in \Psi$ do:
        for every $(p, i) \in \mathcal{R}$
            if $p$ appears in $C$ and $(p, i)$ is needed in $C$ then
                remove $(p, i)$ from $\mathcal{R}$.
            else
                if $PC(C, p, i) \not\subseteq \mathcal{R}$ then remove $(p, i)$ from $\mathcal{R}$.
                else Preconditions($(p, i)$) = Preconditions($(p, i)$) $\cup$ $PC(C, p, i)$ $-$ {$(p, i)$}.
    repeat
        if $(p, i) \in \mathcal{R}$ and $(q, j) \in$ Preconditions($(p, i)$) and $(q, j) \notin \mathcal{R}$ then
            remove $(p, i)$ from $\mathcal{R}$.
    until no changes are made to $\mathcal{R}$.
    return $\mathcal{R}$.
end.


Figure 6.1: Algorithm for finding irrelevant predicate arguments

The algorithm finds the maximal set of predicate arguments that satisfies
conditions A1, A2 and A3. This follows from the observation that for every argument
in the returned set, its precondition arguments are also in $\mathcal{R}$. Furthermore, every
argument that was removed from $\mathcal{R}$ was either needed in some clause or required
some other argument that is not a member of $\mathcal{R}$.
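
The procedure of Figure 6.1 can be sketched as follows. This is an illustrative reading rather than a reference implementation: the clause encoding (a pair of negative and positive atom lists, capitalized strings as variables, tuples as compound terms) and the helper pc are our assumptions, and the forms of rules $r_1$ and $r_3$ used in the test data are guesses made for illustration, since only $r_2$ is quoted in this section.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def variables(t):
    if isinstance(t, tuple):                       # compound term (functor, arg, ...)
        return set().union(*(variables(u) for u in t[1:]))
    return {t} if is_var(t) else set()

def pc(clause, p, i):
    """Minimal argument set containing (p, i) that satisfies A1-A3 in `clause`,
    or None when (p, i) is needed in the clause."""
    neg, pos = clause
    neg_args = [a for _, args in neg for a in args]
    closure, frontier = {(p, i)}, [(p, i)]
    while frontier:
        q, j = frontier.pop()
        for pred, args in neg + pos:
            if pred != q:
                continue
            t = args[j - 1]
            if not is_var(t):
                if t in neg_args:                  # A1 violated
                    return None
                continue
            if neg_args.count(t) > 1:              # A2 violated
                return None
            if t in neg_args:                      # A3: pull in required positions
                for r, r_args in pos:
                    for k, u in enumerate(r_args, start=1):
                        if t in variables(u) and (r, k) not in closure:
                            closure.add((r, k))
                            frontier.append((r, k))
    return closure

def find_irrelevant_arguments(clauses, query_pred):
    arity = {pred: len(args) for neg, pos in clauses for pred, args in neg + pos}
    R = {(p, i) for p, n in arity.items() if p != query_pred
         for i in range(1, n + 1)}
    pre = {s: set() for s in R}
    for clause in clauses:                         # single pass over the clauses
        for s in sorted(R):
            required = pc(clause, *s)
            if required is None or not required <= R:
                R.discard(s)
            else:
                pre[s] |= required - {s}
    changed = True
    while changed:                                 # drop arguments with unmet preconditions
        changed = False
        for s in sorted(R):
            if not pre[s] <= R:
                R.discard(s)
                changed = True
    return R

r1 = ([("flight", ("X", "Y", "C", "A"))], [("route", ("X", "Y", "C", "A"))])  # assumed form
r2 = ([("flight", ("X", "Z", "C1", "A")), ("route", ("Z", "Y", "C2", "A"))],
      [("route", ("X", "Y", ("+", "C1", "C2"), "A"))])
r3 = ([("route", ("X", "Y", "C", "A"))], [("airlineFlight", ("X", "Y", "A"))])  # assumed form

result = find_irrelevant_arguments([r1, r2, r3], "airlineFlight")
print(sorted(result))
```

Under these assumed rules the sketch returns the cost arguments of flight and route, in agreement with the outcome of the trace described above, although the order in which individual arguments are discarded may differ from that trace.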

To summarize this section, we have presented formal definitions of irrelevance of
predicate arguments. As a result, we were able to develop an algorithm for
automatically deriving such irrelevance claims. The formalization also gives us insight into
the utility of removing predicate arguments. Finally, we can also devise algorithms
for deriving logical conclusions from external irrelevance claims. If we are told that
the arguments $\mathcal{R}$ are irrelevant to a query $q$, we remove from the knowledge base all
the formulas that are not independent of $\mathcal{R}$. We then apply our algorithms to the
remaining set of formulas to derive additional irrelevance claims.

6.4 Irrelevance of Predicate Refinements 

A predicate refinement is a set of predicates of equal arity $\mathcal{Q} = \{q_1, \ldots, q_n\}$ that
identifies some set of properties in the domain. For some queries, it is not necessary
to distinguish between properties $q_1, \ldots, q_n$. Therefore, we would like to replace them
by one predicate $q$ which is intended to denote the union of the relations represented
by $q_1, \ldots, q_n$. The predicate $q$ may already exist in the knowledge base (this will
be the common case), or may be a newly introduced predicate. Replacing a set of
predicates $q_1, \ldots, q_n$ by a predicate $q$ has been considered as the problem of predicate
abstraction [Plaisted, 1981; Tenenberg, 1990]. Our treatment of predicate abstraction
is inspired by the work of Tenenberg. As before, we denote the result of the syntactic
transformation we want to perform to a formula $\phi$ by $f_{\mathcal{Q}}(\phi)$. In this case, $f_{\mathcal{Q}}(\phi)$ is
the result of replacing every occurrence of a predicate $q_i$ in $\phi$ by the predicate $q$.

Our strategy is to identify formulas that are independent of $\mathcal{Q}$. Independence will
be used both for defining irrelevance and for deciding to which formulas to apply $f_{\mathcal{Q}}$.
As in the case of predicate arguments, the definition of independence is based on the
intended semantics of the new predicate $q$. Specifically, suppose $I$ is an interpretation
for the set of formulas in the KB in which a predicate symbol $p$ is mapped to a relation
$P$. We define an abstract interpretation $Abs(I)$ for formulas in which occurrences of
$q_1, \ldots, q_n$ are replaced by the predicate $q$. The interpretation $Abs(I)$ will have the
same set of objects as $I$. The relations in $Abs(I)$ are $(Rel(I) - \{Q_1, \ldots, Q_n\}) \cup \{Q\}$,
where $Rel(I)$ are the relations in $I$, and $Q$ is a new relation. 9 The interpretation
$Abs(I)$ is defined as follows:

• The interpretations of terms in $I$ and $Abs(I)$ are identical.

• If $p \notin \{q_1, \ldots, q_n\}$, the predicate $p$ is mapped to the same relation as in $I$.

• The predicate $q$ is mapped to the union of the interpretations of $q_1, \ldots, q_n$, i.e.,
$Q = Q_1 \cup \ldots \cup Q_n$.

Based on the definition of abstract interpretations we define independence as follows:

Definition 6.8: A formula $\psi$ is independent of the predicate refinement $\mathcal{Q}$, with
respect to the knowledge base $\Delta$, if for any interpretation $I$

    $I \models \Delta \implies Abs(I) \models f_{\mathcal{Q}}(\psi)$.

We define the abstract knowledge base resulting from removing $\mathcal{Q}$ from $\Delta$ by:

    $Abs_{\mathcal{Q}}(\Delta) = \{ f_{\mathcal{Q}}(\psi) \mid \psi \in \Delta \text{ and } \psi \text{ is independent of } \mathcal{Q} \}$

9 If the predicate $q$ already exists in the KB, then $Q$ is the relation to which $q$ is mapped.
Note that if $I$ is an interpretation, then the following holds as well:

    $I \models \Delta \implies Abs(I) \models Abs_{\mathcal{Q}}(\Delta)$.    (6.2)

Returning to Example 6.2, we observe that the rule

    $s$ : $\mathit{sportsCar}(X) \Rightarrow \mathit{highRiskToInsure}(X)$

is not independent of the predicate refinement {sportsCar, familyCar}. To see this,
consider the interpretation $I$ in which familyCar and car are mapped to the relation
containing {(Camry)}, and both sportsCar and highRiskToInsure are mapped to
the empty relation. The rule $f_{\mathcal{Q}}(s)$

    $\mathit{car}(X) \Rightarrow \mathit{highRiskToInsure}(X)$

is not satisfied by $Abs(I)$ (which in this case contains the same interpretations for
car and highRiskToInsure as $I$).
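
The interpretation used in this argument is small enough to check directly. In the sketch below, unary relations are Python sets, an interpretation is a dictionary from predicate names to sets, and a rule $p(X) \Rightarrow h(X)$ holds exactly when the relation for $p$ is a subset of the relation for $h$; the helper names are our own assumptions.

```python
def abstract(interp, refinement, new_pred):
    """Build Abs(I): drop the refined predicates and map `new_pred` to their union."""
    union = set().union(*(interp[q] for q in refinement))
    abs_i = {p: rel for p, rel in interp.items() if p not in refinement}
    abs_i[new_pred] = union
    return abs_i

def holds(interp, body, head):
    """Truth of the unary rule body(X) => head(X) in an interpretation."""
    return interp[body] <= interp[head]

I = {
    "familyCar": {"camry"},
    "car": {"camry"},
    "sportsCar": set(),
    "highRiskToInsure": set(),
}
abs_I = abstract(I, ["sportsCar", "familyCar"], "car")

print(holds(I, "sportsCar", "highRiskToInsure"))  # True: s holds in I
print(holds(abs_I, "car", "highRiskToInsure"))    # False: the abstracted rule fails
```

The rule $s$ holds vacuously in $I$ because there are no sports cars, whereas its abstraction is refuted by the Camry.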

As in the case of irrelevance of predicate arguments, weak irrelevance provides a
logical justification for abstracting $\Delta$ by $Abs_{\mathcal{Q}}(\Delta)$:

Theorem 6.9: Let $\mathcal{D}_\psi(\Delta)$ be the set of derivations of the query $\psi$ from the knowledge
base $\Delta$. If $WI(\mathcal{Q}, \psi, \Delta, DI_1, \mathcal{D}_\psi)$ holds, then

1. If $\Delta \vdash \psi$, then $Abs_{\mathcal{Q}}(\Delta) \vdash f_{\mathcal{Q}}(\psi)$.

2. If the formula $\psi$ does not contain predicates from $\mathcal{Q}$, then $Abs_{\mathcal{Q}}(\Delta) \models f_{\mathcal{Q}}(\psi) \implies
\Delta \models \psi$.

The proof is given in Appendix A.

Automatically Deriving Irrelevance of Predicate Refinements

As in the case of irrelevance of predicate arguments, our strategy in deriving
irrelevance of predicate distinctions is to find a set of formulas $\Psi$ that are weakly irrelevant
to the query and such that the formulas in $\Delta - \Psi$ are independent of the predicate
refinement. To devise such a method, we need to be able to verify that a formula is
independent of the predicate refinement. Below we give a condition on clauses that
enables us to verify independence.

Lemma 6.10: A clause $C$ is independent of the predicate refinement $\mathcal{Q}$ w.r.t. the
knowledge base $\Delta$ if and only if the following condition holds.

Suppose $Neg(C)'$ is the result of substituting every occurrence of a predicate of $\mathcal{Q}$
in $Neg(C)$ by some other predicate in $\mathcal{Q}$ using a mapping $f_1$ (two occurrences of the
same predicate in $Neg(C)$ need not be mapped to the same predicate under $f_1$). Then
there exists some $C_2$ such that $f_{\mathcal{Q}}(C_2) = Pos(f_{\mathcal{Q}}(C))$ and $\Delta \models C_2 \cup Neg(C)'$. 10

10 $C_2 \cup Neg(C)'$ denotes the clause containing the union of literals in $C_2$ and $Neg(C)'$.
The proof of the lemma is given in Appendix A. Note that a clause that
contains only positive literals from $\mathcal{Q}$ will be independent of the refinement whenever
it is provable from $\Delta$. In the case of Horn rules, the definition boils down to the
following condition. If $r$ is a rule (whose head is not a predicate in $\mathcal{Q}$), then it must
be the case that given any mapping of occurrences of predicates in $\mathcal{Q}$ in the
antecedent of $r$ to any other predicates in $\mathcal{Q}$, the resulting rule is still entailed by $\Delta$. 11
For instance, in Example 6.2, we can map the occurrence of familyCar in the rule
$\mathit{familyCar}(X) \Rightarrow \mathit{vehicle}(X)$ to sportsCar and the resulting formula will still be
entailed by the knowledge base. Ground atomic formulas are independent of any
predicate refinement.

Finally, note that the condition given in Lemma 6.10 involves checking whether
$\Delta \models C_2 \cup Neg(C)'$, and is therefore in general undecidable. A sufficient condition
can be guaranteed by checking whether $C_2 \cup Neg(C)' \in \Delta$.

6.5 Discussion and Related Work 

This chapter proposes a new approach to research on reasoning with multiple levels
of abstraction. At its core, the approach advocates associating an abstraction of
a knowledge base with the removal of some irrelevant detail, and then using the
framework for analyzing irrelevance to gain insights into the specific abstraction at
hand. Specifically, the approach provides the following advantages:

1. The formal definition of irrelevance provides a logical account of the conditions
under which the abstraction is appropriate.

2. The problem of automatically creating an abstraction for specific queries (based
on deriving irrelevance claims) is well defined. In many cases, we can adopt
existing algorithms for automatically deriving irrelevance claims in order to
create abstractions.

3. The analysis of irrelevance provides insights into several properties of the
abstraction, such as the utility and composability of the abstraction.

4. Associating abstractions with irrelevance claims makes it possible to compose
knowledge bases that each make certain abstractions (or simplifying
assumptions) about the domain. This can be done by explicitly reasoning about the
consistency and adequacy of the irrelevance claims underlying the assumptions
being made by the knowledge bases.

11 If the head of $r$ is a predicate in $\mathcal{Q}$ then in addition to the mapping on the antecedent there
must be some predicate $q_0 \in \mathcal{Q}$ such that replacing the head predicate by $q_0$ still yields a rule that
is entailed by $\Delta$.
We have demonstrated the approach for two kinds of abstractions, removal of
predicate arguments and predicate abstraction. In both cases, we have provided a
logical account for the appropriateness of the abstractions, and we have developed
efficient algorithms for automatically deciding which abstraction is appropriate for
a given query. In the next chapter, we demonstrate how this approach can be used
for automatically composing a knowledge base for a specific query from a set of
knowledge base fragments that are given at multiple levels of abstraction. To pursue
this approach, additional irrelevance subjects should be considered in detail, and
alternative definitions of irrelevance should be explored.

It should be noted that the idea of associating irrelevance with abstractions was
also mentioned by Subramanian [Subramanian, 1989], but was not formalized or
demonstrated concretely. Subramanian also mentions some of the new irrelevance
subjects described here.

Our results on projecting predicate arguments are related to the work by
Ramakrishnan et al. [Ramakrishnan et al., 1988] on identifying existential queries. That
work presents an algorithm for detecting cases in which arguments of subgoals in
logic programs can be removed without affecting the answer to the query. Their
treatment of predicate arguments differs in that they distinguish between different
occurrences of a predicate in a program. As a result, their algorithm may decide to
project an argument of a predicate $p$ in one occurrence of $p$ and not to project it in a
different occurrence, thereby requiring two versions of the relation denoted by $p$. We
can refine our treatment in the same way by applying a syntactic transformation to
our knowledge base in which we rename every occurrence of a predicate $p$ such that
no two occurrences in the original knowledge base have the same predicate name.
The definition of independence that we present in Section 6.3 is better motivated
semantically than the one they present and applies to more than just Horn rules. The
syntactic condition for independence given in Theorem 6.7 generalizes the condition
given in [Ramakrishnan et al., 1988] to arbitrary clauses. Finally, their algorithm for
identifying irrelevant predicate arguments is based on building a rule-goal graph of
the rules in the knowledge base. Our algorithms use the query-tree and can therefore
detect a larger class of irrelevance claims by considering interpreted literals in the
rules, minimal derivations and extended languages including negated EDB subgoals.

Our treatment of predicate abstraction in Section 6.4 is inspired by the work
of Tenenberg [Tenenberg, 1990]. Tenenberg considers the problem of finding the
maximal set of clauses that are independent of a predicate refinement. He presents a
constructive proof for the existence of such a set and shows that unless the knowledge
base A is empty, the set will be infinite. The abstract knowledge base we consider,
Abs(A), is a finite subset of Tenenberg's maximal set and is effectively computable
in many cases. Tenenberg also considers a finite and computable subset that he
calls MembAbs. However, MembAbs is a subset of Abs(A). The contribution of
our work on predicate refinements is in providing a logical justification for when a
predicate abstraction is appropriate for a given query and providing algorithms for
automatically verifying that the justification holds.

Giunchiglia and Walsh [Giunchiglia and Walsh, 1992] present a theory of abstraction
in which they identify two classes of abstractions. The first, TD-abstractions,
requires that any formula that is derivable from the abstract knowledge base must
also be derivable from the original knowledge base. The class of TI-abstractions
requires that any formula which is derivable from the original knowledge base also be
derivable from the abstract one. One of the key aspects underlying our treatment of
the connection between irrelevance and abstractions is the intuition that removing
irrelevant knowledge should not result in the ability to derive conclusions that were not
derivable earlier. As a result, in the examples we presented, the abstractions justified
by irrelevance claims fall under the class of TD-abstractions. Moreover, weak
irrelevance also guarantees that if the query was derivable in the original knowledge base,
it will also be derivable in the abstract knowledge base. Therefore, our abstractions
can be viewed as being TI-abstractions with respect to a specific query.

Historically, TI-abstractions have received more attention (e.g., [Sacerdoti, 1974;
Plaisted, 1981]). In that work, the intuition was that in most cases the information
removed was irrelevant to the query, and therefore the answer obtained from the
abstract knowledge base would hold (or could be refined to an answer) in the original
knowledge base. For example, ABSTRIPS [Sacerdoti, 1974] made the assumption
that the action preconditions of lower criticality values are easier to achieve and can
therefore be ignored when formulating an abstract plan. The utility of the abstraction
depended on how often the problem solver would have to backtrack across abstraction
levels. To articulate the intuition behind these kinds of abstractions by irrelevance
claims, we need to extend our framework for reasoning about irrelevance to include
probabilistic (or default) irrelevance claims. We need to be able to state that some
subject is irrelevant to a query with some probability (or under some conditions). To
understand these abstractions we also need to refine the theory of Giunchiglia and
Walsh. Their theory is based on two kinds of relationships between the statements:

S1. $A \vdash q$, and

S2. $Abs(A) \vdash Abs(q)$,

where Abs(A) and Abs(q) are the abstractions of the knowledge base and the query,
respectively. For TD-abstractions, they require $S2 \Rightarrow S1$, while for TI-abstractions
they require $S1 \Rightarrow S2$. Instead of considering only these two strict relationships, we
can consider other possibilities. For example, we can require that if S2 holds, then
there is some prespecified condition on q and the possible derivations of q such that
S1 holds. Such a condition should be useful in telling us whether the answers given
from the abstract knowledge base would follow from the original knowledge base, or
how to refine a derivation in the abstract KB into a derivation in the original KB.
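
The two directions can be checked mechanically on a small example. The following is a minimal sketch, assuming a propositional Horn knowledge base and a predicate abstraction that merely renames atoms; the atoms (sparrow, penguin, bird, flies, swims) and the helper names closure and abstract_rules are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# A propositional Horn rule: (body atoms, head atom). A fact has an empty body.
Rule = Tuple[FrozenSet[str], str]


def closure(rules: List[Rule]) -> Set[str]:
    """Return every atom derivable from the rules by forward chaining."""
    derived: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and body <= derived:
                derived.add(head)
                changed = True
    return derived


def abstract_rules(rules: List[Rule], mapping: Dict[str, str]) -> List[Rule]:
    """Apply a predicate abstraction (atom renaming) to every rule."""
    def rename(atom: str) -> str:
        return mapping.get(atom, atom)
    return [(frozenset(rename(b) for b in body), rename(head))
            for body, head in rules]


# Hypothetical knowledge base: one fact and two rules.
kb: List[Rule] = [
    (frozenset(), "sparrow"),
    (frozenset({"sparrow"}), "flies"),
    (frozenset({"penguin"}), "swims"),
]
# Hypothetical abstraction: merge two predicates into one.
abs_map = {"sparrow": "bird", "penguin": "bird"}

original = closure(kb)
abstract = closure(abstract_rules(kb, abs_map))
atoms = {a for body, head in kb for a in body | {head}}

# S1 => S2 for every atom q (the TI direction).
ti_holds = all(abs_map.get(q, q) in abstract for q in atoms if q in original)
# S2 => S1 for every atom q (the TD direction).
td_holds = all(q in original for q in atoms if abs_map.get(q, q) in abstract)
```

In this example the abstraction is TI but not TD: the abstract knowledge base derives swims, which the original one does not. A condition on the query of the kind discussed above would have to rule out such atoms.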

Our analysis of irrelevance of predicate arguments is one instance of this more
general class. For example, if the first argument of a binary predicate p is irrelevant
to a query, and p(a) is derivable from Abs(A), then this guarantees that there exists
some X such that p(X, a) is derivable from A.

Knoblock's ALPINE system [Knoblock, 1990] is another example of this
generalization. The ALPINE planner creates an abstraction hierarchy which guarantees
that if there is a plan for a goal in the original problem space, then there will be
one in the abstract space such that the original plan is a monotonic refinement of the
abstract plan. This condition enables the planner to considerably prune its search
when it refines an abstract solution, since it need only consider monotonic refinements
of the abstract plan. Knoblock et al. [Knoblock et al., 1991] present other examples
of possible relationships between abstract and concrete plans which are then used to
prune the search of a planner.

Additional work on automatically generating abstractions is described in [Ellman,
1990; Ellman, 1992; Lowry, 1992]. Work on analysis of the utility of abstractions is
described in [Knoblock, 1990; Bacchus and Yang, 1992].


Chapter 7 

Automated Modeling of Physical 
Systems 


The previous chapter described how relevance reasoning can play a key role in
facilitating reasoning in complex domains that require extensive use of abstractions. An
important domain with such characteristics is that of modeling of physical systems.
In this domain, we are given a theory of the physical world, a description of a specific
system and a query about that system. Our goal is to choose a representation for the
system that will enable us to answer the query effectively. Physical systems can be
represented in multiple ways, using several levels of detail, abstraction and differing
perspectives. Therefore, the main challenge in solving this problem is choosing among
alternative possible representations of the system. The chosen representation must
be adequate for answering the query, but must also be as parsimonious as possible,
in order to allow efficient inference.

This chapter considers the automated modeling problem from the perspective
of relevance reasoning. In doing so, we shed light on the problem, showing that
certain aspects of it can be automated by simple considerations of relevance reasoning.
Furthermore, we show how additional domain knowledge, which is needed for model
selection, can be expressed in the form of irrelevance claims. We combine these
observations into a novel model selection algorithm, based on relevance reasoning.
Section 7.1 describes the model formulation problem and its relation to relevance
reasoning. Section 7.2 describes our algorithm and Section 7.3 presents an analysis
of its properties.

7.1 Problem Formulation 

In order to reason about a physical system for tasks such as simulation, design or
diagnosis, we need some representation of the system. We call such a representation
a model for the system. In this chapter, a model refers to a representation of a system.
We use the phrase logical-model to refer to the concept of a model in Mathematical
Logic (cf. [Enderton, 1972]).

For complex physical systems, there is typically no single model of the system that
will be adequate and enable efficient inference for all possible queries. Consequently,
the goal of the automated modeling problem is to find a model for a system that is
best suited for a specific query.

7.1.1 Compositional Modeling 

We construct a model for a given physical system based on the Compositional
Modeling approach described in [Falkenhainer and Forbus, 1991]. In this approach, a
physical situation is modeled as a collection of model fragments. Each model
fragment represents some atomic aspect of a physical object or a physical phenomenon.
For example, a model fragment may describe the dependence of the voltage of a
battery on its charge level (as shown in Figure 7.1), or it may describe the process of
fluid flow through a pipe connecting two containers.

A model fragment contains a set of participants, which are the set of objects in
the domain that are taking part in the phenomenon being described. An instantiated
model fragment is a binding of each of the participants to an object in the domain.
The model fragment contains a set of operating conditions which the participants need
to satisfy in order for the instantiation to be valid. The behavior conditions of the
model fragment specify the behavior of the participant objects in the phenomenon
being modeled.

A model for a system in a given state is a set of instantiated model fragments
whose operating conditions are satisfied. The union of the behavior conditions of the
instantiated model fragments gives rise to a simulation model for that state. The
simulation model is used to determine the next state of the system, in which a new
simulation model is chosen.

A model fragment consists of the following components:¹

Participants: These are the set of objects participating in a model fragment
instance. A participant can be viewed as a unary function from a model fragment
instance to the objects of the domain. In Figure 7.1, the participant is an
instance of class battery.

Variables: These are time-dependent variables associated with the participants in
a model fragment instance. We distinguish two kinds of variables. The first,
which are also called quantities, are variables that are continuous over time (e.g.,

¹ For a complete formal discussion of model fragments, see [Farquhar et al.]. The description
below includes only those aspects relevant to our discussion.
Charge-sensitive-voltage(X : battery)

Variables:
    voltage(X), chargeLevel(X), damaged(X)
Operating conditions:
    ¬damaged(X)
    6 < chargeLevel(X) < 30
Modeling conditions:
    Relevant(rechargeableBattery(X)) ∧
    Relevant(chargeLevel(X)) ∧
    Relevant(voltage(X))
Behavior conditions:
    voltage(X) = f(chargeLevel(X))


Figure 7.1: An example model fragment

voltage, current). The second kind are binary variables that may change over
time (e.g., damaged(battery), on(switch)). Binary variables are represented by
ground atomic literals.² In our example, the quantities are the voltage and
charge level of the battery, and damaged(X) is a binary variable.

Operating conditions: These conditions specify when an instance of the model
fragment exists. They are conditions on the participants of the model fragment
and on its variables. They include both structural constraints on the
participants as well as constraints on the ranges of the variables. In our example,
we require that the battery not be damaged and that the charge level of the
battery be between 0 and 30.

Modeling conditions: These are conditions on the model of the system that need
to be satisfied in order for an instance of the model fragment to exist. They are
used in order to distinguish different ways of modeling the same phenomenon.
We distinguish two classes of modeling conditions. The first class consists of
relevance claims. As explained in Chapter 6, relevance claims can be used to
express the assumptions underlying an abstraction. For example, a description
of the battery that ignores its thermal aspects may be based on the irrelevance
claim stating that the predicate temperature is irrelevant to the query. We
assume that all the irrelevance claims used in the modeling conditions are based on
a single definition of irrelevance in our space. In the modeling conditions we use


² Note that the time argument of all variables is left implicit.
the predicate relevant, which should be thought of as denoting the complement
of irrelevant. The second class of modeling constraints includes assumptions
about the problem solving task. These include assumptions about the desired
accuracy of the answer and the temporal granularity of the model (e.g., we
will model a battery differently depending on whether we are considering its
behavior over one second or over one year).

In our discussion we assume the following convention about the interpretation
of predicates used in the modeling conditions. A positive literal of a
predicate is assumed to denote an assumption that yields a more complicated
model of a phenomenon. A negative literal denotes a simplifying assumption.
For example, $\neg relevant(Temperature(battery))$ states that the representation
is simplified to ignore the thermal aspect of the battery, whereas the literal
$relevant(Temperature(battery))$ in the modeling conditions states that the
model fragment considers the temperature aspect of the battery. As another
example, the literal $\neg large(TimeScale)$ states that the representation is simplified
to ignore longer term effects on the battery.³

In our example, the model fragment requires that the charge level and the
voltage are considered relevant properties, and that the rechargeability aspect
of the battery be relevant as well. Modeling conditions are distinguished from
operating conditions in that they are conditions about the model (i.e.,
meta-level conditions) as opposed to conditions on the domain and state.

Behavior conditions: The statements in the behavior conditions are true whenever
the instance of the model fragment exists. Essentially, these sentences describe
the phenomenon being modeled. We distinguish three kinds of behavior
conditions. The first kind describe continuous phenomena (e.g., a fluid flow) by a set
of equations involving the continuous quantities of the model fragment. The
equations may be quantitative (algebraic and ordinary differential equations)
and can also be qualitative (e.g., the rate of evaporation negatively affects the
amount of water in the cup). The second kind of behavior conditions describe
instantaneous changes of the binary variables of the model fragment (e.g.,
turning off a switch). Finally, the third kind of behavior conditions describe time
independent properties of participants in the model fragment. We assume that
a model fragment contains behavior conditions of only one kind. In our
discussion we assume that the behavior conditions do not contain inequalities on

³ Some assumptions may be multivalued. For example, the time scale may be either small, medium
or large. The algorithms we describe in this chapter can be extended in a straightforward fashion
to deal with such assumptions. However, for clarity we assume here that modeling assumptions are
binary.
quantities. The behavior conditions of the model fragment in our example
describe the functional relationship between the voltage and the charge level of a
battery.

The semantics of model fragments can be summarized as follows. Let $f_1, \ldots, f_n$
be the participants of a model fragment M. Let $o(X_1, \ldots, X_n)$ be its operating
conditions, $a(X_1, \ldots, X_n)$ be its modeling conditions, and $b(X_1, \ldots, X_n)$ be its
behavior conditions. First, whenever a set of objects satisfies the operating and modeling
conditions, there exists an instance of the model fragment. Formally:

$\forall X_1, \ldots, X_n \, [\, (o(X_1, \ldots, X_n) \wedge a(X_1, \ldots, X_n)) \Rightarrow (\exists m)\, (M(m) \wedge \bigwedge_{i=1}^{n} f_i(m) = X_i) \,]$

The existence of the model fragment also implies that the variables mentioned in
it are defined. Furthermore, the existence of the model fragment implies that the
behavior conditions hold:

$\forall X_1, \ldots, X_n \, [\, (\exists m)\, (M(m) \wedge \bigwedge_{i=1}^{n} f_i(m) = X_i) \Rightarrow b(X_1, \ldots, X_n) \,]$
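
As an illustration of these semantics, the following is a minimal sketch of a model fragment as a data structure, assuming that operating and behavior conditions are given as functions of the state and that modeling conditions are a set of positive literals. The class ModelFragment, its field names, and the linear function standing in for f are hypothetical; the operating bounds are those printed in Figure 7.1.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet, Optional

State = Dict[str, Any]  # variable name -> current value


@dataclass(frozen=True)
class ModelFragment:
    name: str
    operating: Callable[[State], bool]             # o(X1, ..., Xn)
    modeling: FrozenSet[str]                       # literals required by a(X1, ..., Xn)
    behavior: Callable[[State], Dict[str, float]]  # consequences of b(X1, ..., Xn)

    def instantiate(self, state: State,
                    assumptions: FrozenSet[str]) -> Optional[Dict[str, float]]:
        """Return what the behavior conditions imply if an instance exists, else None."""
        if self.operating(state) and self.modeling <= assumptions:
            return self.behavior(state)
        return None


# Hypothetical encoding of the fragment in Figure 7.1; f is assumed to be linear.
charge_sensitive_voltage = ModelFragment(
    name="Charge-sensitive-voltage",
    operating=lambda s: (not s["damaged"]) and 6 < s["chargeLevel"] < 30,
    modeling=frozenset({
        "Relevant(rechargeableBattery)",
        "Relevant(chargeLevel)",
        "Relevant(voltage)",
    }),
    behavior=lambda s: {"voltage": 1.0 + 0.5 * s["chargeLevel"]},
)
```

An instance exists only when both the operating conditions and the modeling conditions hold, mirroring the first formula; the returned dictionary plays the role of the behavior conditions in the second.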

Composing Simulation Models

Given a description of the physical configuration of a system, a particular state it is
in, and a query about the state, the task is to formulate a model that represents the
physical phenomena occurring in the state. Such a representation is composed of a set
of instantiated model fragments whose operating conditions and modeling conditions
are satisfied in that state. These model fragment instances are called the set of active
model fragments in that state, and together will comprise the simulation model for
that state. The behavior conditions of the model fragments in the simulation model
give rise to a set of equations and logical formulas that must hold among participants
and the variables as a consequence of the phenomena taking place. They are used to
determine the next state of the system in which a new simulation model is selected.

The main advantage of compositional modeling that makes it appropriate for
our task is its modularity. Writing model fragments, each describing a single
phenomenon, is a much easier task than composing a complete model for every possible
system and query. Adding model fragments to an existing library is also much easier.
Furthermore, model fragments can be reused in any appropriate context.

7.1.2 The Model Fragment Library 

To facilitate compositional modeling, we impose additional structure on the model
fragment library. Specifically, model fragments are grouped into composite model
fragments (CMFs), and CMFs are further grouped into assumption classes. Before we
discuss these constructs, we briefly describe the notion of causal ordering of quantities.
Causal Ordering

Equations in model fragments describe the relationships among the continuous
variables involved in the modeled phenomenon. Those relations have no causal import.
For example, the equation for Ohm's law, V = iR, only states the relationship
between the current, the voltage and the resistance. In building a model for a system
and explaining it, we often want to know exactly how the variables are determined,
i.e., what are the causal dependencies between the quantities in the model. For
example, in a model containing Ohm's law, we may say that the voltage is determined
by the current and the resistance. A causal ordering [Iwasaki and Simon, 1986;
de Kleer and Brown, 1986] specifies the dependency structure among the quantities
in the model fragment.⁴ It is specified by causally orienting every equation in the
model fragment, i.e., associating one quantity $f(e)$ with every equation $e$ in the model.
The quantity $f(e)$ must be part of $e$, and must not be associated with any other
equation in the model fragment (i.e., if $e_1 \neq e_2$, then $f(e_1) \neq f(e_2)$). The quantities in the
model fragment that are not associated with any equation are called exogenous in the
causal ordering $f$. The exogenous quantities are assumed to be determined by other
phenomena (described by other model fragments), and can therefore be considered
as input to the current model fragment. Given a causal ordering $f$, we say that a
quantity $v_1$ causally affects a quantity $v_2$ if:

• The quantity $v_1$ appears in the equation $e$, and $f(e) = v_2$, or

• There exists some quantity $v_3$, such that $v_1$ causally affects $v_3$ and $v_3$ causally
affects $v_2$.

If v is a continuous variable, we say that a model fragment m can determine it
if there is some causal ordering of the quantities in m such that v is not exogenous.
If v is a binary variable, we say that it can be determined by m if it appears in its
behavior conditions.

The set of variables that can be determined by a model fragment are called its
output variables. We now describe the structures in the model fragment library.
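
The definitions of a causal ordering, of exogenous quantities and of the causally-affects relation can be sketched as follows. This is an illustrative sketch only: equations are assumed to be represented just by the set of quantities they mention, the function names are hypothetical, and the trivial case in which a quantity affects itself is left out.

```python
from typing import Dict, FrozenSet, Set

# Each equation is identified by a name and lists the quantities it mentions.
Equations = Dict[str, FrozenSet[str]]
# A causal ordering f maps each equation to the one quantity it determines.
Ordering = Dict[str, str]


def is_causal_ordering(eqs: Equations, f: Ordering) -> bool:
    """f(e) must occur in e, and no quantity may be assigned to two equations."""
    heads = list(f.values())
    return (set(f) == set(eqs)
            and all(f[e] in eqs[e] for e in eqs)
            and len(heads) == len(set(heads)))


def exogenous(eqs: Equations, f: Ordering) -> Set[str]:
    """Quantities that are not associated with any equation under f."""
    mentioned: Set[str] = set()
    for quantities in eqs.values():
        mentioned |= quantities
    return mentioned - set(f.values())


def causally_affects(eqs: Equations, f: Ordering) -> Dict[str, Set[str]]:
    """Map each quantity to the quantities it causally affects (transitively)."""
    affects: Dict[str, Set[str]] = {}
    for e, quantities in eqs.items():
        for q in quantities:
            if q != f[e]:  # skip the trivial self-dependence of f(e)
                affects.setdefault(q, set()).add(f[e])
    changed = True
    while changed:  # transitive closure by repeated expansion
        changed = False
        for targets in affects.values():
            extra: Set[str] = set()
            for t in targets:
                extra |= affects.get(t, set())
            if not extra <= targets:
                targets |= extra
                changed = True
    return affects


# Hypothetical fragment: Ohm's law plus a power equation.
eqs: Equations = {
    "ohm": frozenset({"V", "I", "R"}),
    "power": frozenset({"P", "V", "I"}),
}
f: Ordering = {"ohm": "V", "power": "P"}
```

With this orientation the current and the resistance are exogenous, and the resistance causally affects the power only through the voltage.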

Composite Model Fragments 

Some model fragments describe the same phenomenon, but differ only in their
operating regions, i.e., the value ranges assumed for the continuous variables in the model
fragment. For example, the function describing the dependence of the voltage of a
battery on its charge level changes depending on the value of the charge level:

chargeLevel ≤ 6 ⇒ voltage = f1(chargeLevel)


⁴ We do not consider causal dependencies among binary variables here.
6 < chargeLevel < 30 ⇒ voltage = f2(chargeLevel)
chargeLevel > 30 ⇒ voltage = f3(chargeLevel)

In selecting a model that will be adequate for multiple states of a simulation, it is
easier to think of such model fragments as being grouped into a single composite
model fragment (CMF). A CMF is a set of model fragments describing the entire
operating range of the variables participating in the phenomenon. In every state of
the simulation, the operating conditions will guarantee that only one model fragment
from every CMF will be included in the simulation model. Clearly, a CMF can also
be a singleton set.
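
The selection of the single active model fragment of a CMF in a given state can be sketched as below. This is only an illustration: a fragment is reduced to a name and an operating condition, the names low, mid and high are hypothetical, and the way the boundary values are assigned to regions is an assumption of the sketch rather than something fixed by the text.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, float]
# A model fragment here is just (name, operating condition); behavior is omitted.
Fragment = Tuple[str, Callable[[State], bool]]


def active_fragment(cmf: List[Fragment], state: State) -> Optional[str]:
    """Return the name of the single fragment whose operating conditions hold."""
    active = [name for name, holds in cmf if holds(state)]
    if len(active) > 1:
        raise ValueError(f"operating regions overlap: {active}")
    return active[0] if active else None


LOW, HIGH = 6.0, 30.0  # region boundaries; boundary handling is assumed

voltage_cmf: List[Fragment] = [
    ("low", lambda s: s["chargeLevel"] <= LOW),
    ("mid", lambda s: LOW < s["chargeLevel"] < HIGH),
    ("high", lambda s: s["chargeLevel"] >= HIGH),
]
```

Because the regions partition the range of the charge level, exactly one fragment is returned for every state, which is the property the operating conditions are meant to guarantee.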


Assumption Classes 


Composite model fragments are further grouped into assumption classes.⁵ An
assumption class is a set of CMFs that describe the same phenomenon based on
different and contradictory modeling conditions. As stated, modeling conditions express
the assumptions that we are making in the representation of the system. They
express the underlying abstractions and approximations that are assumed by the model
fragment. Figure 7.2 shows an assumption class consisting of different ways of
describing the voltage of the battery. One way to model the voltage is to assume it is
constant. Another way is to assume it degrades over time. More complicated ways of
modeling the battery consider aspects such as the charge level and the temperature.
Since CMFs in an assumption class are contradictory, any consistent set of modeling
assumptions will include at most one CMF from a single instantiated assumption
class.


CMFs in an assumption class are partially ordered by a simplicity relation, denoted
by the predicate <. A CMF $c_1$ is said to be simpler than a CMF $c_2$ if $c_1$ makes
a superset of the simplifying assumptions made by $c_2$. The transitive closure of < will
be denoted by the predicate <*. In the figure, the simplicity relation is denoted by
the directed arcs. We assume that every assumption class has a single most complicated
CMF and a single simplest CMF. The former represents the most detailed way
of describing a phenomenon, while the latter represents the simplest way of doing so
(e.g., the voltage of the battery can be modeled as constant). Finally, we assume that
if $c_i < c_j$ then:

1. The output variables of $c_j$ are a superset of the output variables of $c_i$.

2. If $f_i$ is a causal ordering of the variables of $c_i$, then there exists a causal ordering
$f_j$ of $c_j$ such that the causal relations among variables in $c_i$ (given by $f_i$) are a
subset of the causal relations among variables in $c_j$ (given by $f_j$).

⁵ The term assumption-class is used in order to be consistent with [Falkenhainer and Forbus,
1991], not because it is especially appropriate.
Figure 7.2: Battery voltage assumption class


All of these properties follow if we assume that whenever $c_i < c_j$, then $c_i$ is a causal
approximation of $c_j$ [Nayak, 1992b]. Nayak has shown that causal approximations
cover most approximation relations encountered in practice.

In contrast to previous treatments of assumption classes, we assume that the
modeling conditions of CMFs in an assumption class precisely characterize the differences
of assumptions made by CMFs in the assumption class. Specifically, this is formalized
as follows. Suppose the modeling conditions of a CMF $c$ is the conjunction of the
literals in the set $A_c$, and suppose $c_i < c_j$. Then we can annotate the link from
$c_i$ to $c_j$ with a set of positive literals $p_1, \ldots, p_h$, which means that $c_i$ is making the
simplifying assumptions $\{\neg p_1, \ldots, \neg p_h\}$ in addition to the simplifying assumptions
made by $c_j$, or formally,

$A_{c_i} = (A_{c_j} \setminus \{p_1, \ldots, p_h\}) \cup \{\neg p_1, \ldots, \neg p_h\}$


The articulation of these differences will play an important role in selecting the
simplest model.
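
The link annotation and the simplicity relation can be sketched with sets of literals. This is a minimal illustration, assuming that a negative literal is written with a leading "~" and that "simpler" means a proper superset of simplifying assumptions; the literal names charge, temperature and time are hypothetical and do not reproduce Figure 7.2.

```python
from typing import FrozenSet, Set

Literals = FrozenSet[str]  # "p" is a positive literal, "~p" a simplifying assumption


def simplify(conds: Literals, dropped: Set[str]) -> Literals:
    """Replace the positive literals in `dropped` by their negations."""
    if not dropped <= conds:
        raise ValueError("can only negate literals that are currently positive")
    return frozenset(conds - dropped) | frozenset("~" + p for p in dropped)


def simplifying(conds: Literals) -> Set[str]:
    """The simplifying assumptions (negative literals) among the conditions."""
    return {lit for lit in conds if lit.startswith("~")}


def simpler_than(a: Literals, b: Literals) -> bool:
    """a is simpler than b if it makes a proper superset of b's simplifying assumptions."""
    return simplifying(a) > simplifying(b)


# Hypothetical assumption class with three CMFs.
detailed: Literals = frozenset({"charge", "temperature", "time"})
no_thermal = simplify(detailed, {"temperature"})
constant = simplify(no_thermal, {"charge", "time"})
```

Since the relation is defined by set inclusion, transitivity comes for free: constant is simpler than detailed without any explicit closure computation.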
Other Assumptions About the Library

Composing non-contradictory model fragments: A variable may be affected
by more than one phenomenon and therefore by more than one assumption class.
For example, the amount of water in a container can be affected by an evaporation
process and by a condensation process. These two phenomena are represented by
different assumption classes. One of the key assumptions on model fragments in the
compositional modeling approach is that they be composable. Specifically, this means
the following. Let $m_1$ and $m_2$ be two model fragments that have consistent operating
conditions and modeling conditions, such that both can determine a variable $v$. Then
$m_1$ and $m_2$ can be composed to a single model fragment $m_3$ describing the union of
the phenomena in $m_1$ and in $m_2$. The procedure for creating $m_3$ is assumed to be given.

Coherence of the Library 

The library coherence assumption essentially requires that if we have a set of model
fragments that have consistent modeling assumptions and whose operating conditions
are satisfied, then the resulting set of equations will not be over constrained (i.e., will
not have more equations than quantities). Formally, this assumption is defined as
follows:

Definition 7.1: A model fragment library satisfies the library coherence assumption
if the following condition holds. Let M be any set of model fragments in the library
and s be any state such that:

1. The conjunction of modeling conditions of model fragments in M, a(M), is
consistent.

2. If $a(M) \models Relevant(v_1)$, then $v_1$ appears in some model fragment in M.

3. The conjunction of the operating conditions of model fragments in M is
satisfied in s.

Then, the set of equations given by the union of the behavior conditions of M is
not over constrained. ∎

Note that a set of equations that are not over constrained can always be made
complete by assuming that some variables are exogenous.
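
One way to test this property is to look for a causal orientation directly. The sketch below assumes a structural reading of "not over constrained" that is somewhat stronger than the simple count given above: every equation must be matched to a distinct quantity it mentions. It uses a standard augmenting-path bipartite matching; the function name causal_orientation and the example equations are hypothetical. Quantities left unmatched are the ones that would be declared exogenous.

```python
from typing import Dict, FrozenSet, Optional, Set


def causal_orientation(eqs: Dict[str, FrozenSet[str]]) -> Optional[Dict[str, str]]:
    """Match every equation to a distinct quantity it mentions, or return None.

    Uses augmenting paths on the bipartite equation/quantity graph.
    """
    owner: Dict[str, str] = {}  # quantity -> equation that determines it

    def assign(e: str, seen: Set[str]) -> bool:
        for q in sorted(eqs[e]):
            if q in seen:
                continue
            seen.add(q)
            # Take q if it is free, or if its current owner can move elsewhere.
            if q not in owner or assign(owner[q], seen):
                owner[q] = e
                return True
        return False

    for e in eqs:
        if not assign(e, set()):
            return None  # some group of equations shares too few quantities
    return {e: q for q, e in owner.items()}


# Hypothetical equation sets, each equation given by the quantities it mentions.
coherent = {"e1": frozenset({"x", "y"}), "e2": frozenset({"y"})}
over_constrained = {
    "e1": frozenset({"x"}),
    "e2": frozenset({"x"}),
    "e3": frozenset({"w", "y", "z"}),
}
```

The second example has fewer equations than quantities and would pass the simple count, yet no orientation exists because two equations compete for the single quantity x.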

7.1.3 Other Modeling Constraints

In addition to the modeling conditions attached to each model fragment, we assume
a background theory of modeling constraints, C. We use C to express additional
constraints on the possible models. The constraints can either be domain independent
(e.g., general constraints entailed by relevance claims) or domain specific constraints.
For example, the following constraint states that if the refinement of objects along
the property r is relevant for an object o that is relevant to the query, and r(o, X)
holds, then X is relevant to the query:⁶

$relevantObjectRefinement(O, R, G) \wedge r(O, X) \wedge relevantObject(O, G) \Rightarrow relevantObject(X, G)$

The constraints in C may also be heuristic in nature. For example, the following
constraints are a variation on the object expansion heuristic used in [Falkenhainer and
Forbus, 1991]. They are used to enforce the relevance of certain objects in a system,
given the initial set of relevant objects.

$structuralHierarchySlot(p) \wedge relevantObject(X, G) \wedge relevantObjectRefinement(X, p, G) \wedge p(X, Y) \Rightarrow relevantObject(Y, G)$

$structuralHierarchySlot(p) \wedge p(X, Y) \wedge relevantObject(X, G) \wedge relevantObject(Y, G) \Rightarrow relevantObjectRefinement(X, p, G)$

The heuristic states that if the objects $s_1$ and $s_2$ are both relevant to the query,
and t is their least common ancestor in the structural hierarchy, then any object in
the hierarchy that is either in between t and $s_1$ (or between t and $s_2$), or a child of
such an object, will be considered relevant to the query.

Essentially, constraints can be expressed using arbitrary first order formulas. For
efficiency reasons, we assume that the constraints in C are expressed using only Horn
rules. In practice, Horn rules have been expressive enough for the modeling
constraints we have encountered.
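
Because the constraints are Horn rules, their consequences can be computed by forward chaining. The following minimal sketch hard-codes the single refinement constraint given above and closes a set of ground facts under it; facts are assumed to be tuples whose first element is the predicate name, and the objects (circuit, battery, bulb, cell, filament), the slot hasPart and the query name q are hypothetical.

```python
from typing import Set, Tuple

Fact = Tuple[str, ...]  # e.g. ("relevantObject", "battery", "q")


def propagate_relevance(facts: Set[Fact]) -> Set[Fact]:
    """Close the facts under the rule
    relevantObjectRefinement(O, R, G) and R(O, X) and relevantObject(O, G)
        => relevantObject(X, G).
    """
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for fact in list(facts):
            if fact[0] != "relevantObjectRefinement":
                continue
            _, obj, rel, query = fact
            if ("relevantObject", obj, query) not in facts:
                continue
            for other in list(facts):
                # Look for ground facts of the form rel(obj, X).
                if len(other) == 3 and other[0] == rel and other[1] == obj:
                    new = ("relevantObject", other[2], query)
                    if new not in facts:
                        facts.add(new)
                        changed = True
    return facts


# Hypothetical system description and initial relevance claims.
initial: Set[Fact] = {
    ("relevantObject", "circuit", "q"),
    ("relevantObjectRefinement", "circuit", "hasPart", "q"),
    ("relevantObjectRefinement", "battery", "hasPart", "q"),
    ("hasPart", "circuit", "battery"),
    ("hasPart", "circuit", "bulb"),
    ("hasPart", "battery", "cell"),
    ("hasPart", "bulb", "filament"),
}
closed = propagate_relevance(initial)
```

The filament stays outside the relevant set because no refinement of the bulb along hasPart was declared relevant, which is the kind of pruning the constraint is meant to express.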

7.1.4 The Model Formulation Problem 

Informally, the model formulation problem is to choose a simulation model (i.e., a
set of instantiated model fragments) that can answer a given query about a system
in a specific state. However, a simulation of a system may go through many states,
and we do not want to repeat the costly selection process at every state. Therefore,
we pose the model formulation problem as selecting a small set of CMFs, called the
scenario model. The scenario model has the property that its modeling conditions are
consistent, and that at every state, we can choose a simulation model from it easily.

Formally, the model formulation problem is to choose a scenario model, given the
domain theory (i.e., model fragment library and background modeling constraints),
a system description and a query, defined as follows:

⁶ Note that we use specific predicate names in order to make the type of the subject in the
relevance claim explicit.
• System description: A set of facts about a physical system and its initial
state. This description typically includes a set of individuals (i.e., components
of the system), their physical structure and the initial values of variables in the
system.

• Query: 

- A variable v (or list of variables) whose behavior we want to predict in the
simulation of the system.

- A list E of exogenous variables and terms (i.e., ground atomic formulas).
The elements in E are assumed to be given and are outside the scope of
the simulation for which we are constructing a scenario model. We can
use E to circumscribe the set of states for which we are creating a scenario
model (e.g., we may construct a scenario model only for states in which
the battery is not damaged).

- A list Init of modeling constraints that we want to enforce. Implicitly,
Init includes Relevant(v).⁷

A scenario model is a set of instantiated CMFs whose modeling conditions are
consistent. At every state the system checks the operating conditions only of the
CMFs in the scenario model. The conditions of at most one model fragment from
each CMF will be satisfied in the state, and these model fragments will comprise the
simulation model of the state. We denote the scenario model by S and the simulation
model created from it in state s by $S_s$.

The resulting scenario model must satisfy several properties. First, it must be
adequate for answering the query. This means that it must be coherent and sufficient
as follows:

Definition 7.2: A scenario model S is adequate if

C1. There is a logical-model M for the background constraints C such that

A. All the modeling conditions of CMFs in S are satisfied in M.

B. If $Relevant(v_1)$ is satisfied in M and $v_1$ is a variable, then some CMF in
S includes $v_1$.

C2. For any state s of the simulation, the equations arising in $S_s$ can be made
complete (i.e., not over constrained or under constrained) by adding exogenous
variables. Furthermore, the equations in $S_s$ include the variable v, and v is not
exogenous in the complete set (and therefore we can say that $S_s$ determines v).

’ Nolo that these constraints rati also' he specified as part of C. However, it is often more natural 
to specify them as part,. of the query 
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In order for a scenario model to be useful, it should be as simple as possible: 

Definition 7.3: A scenario model 𝒮_1 is simpler than 𝒮_2 if there is a mapping
φ : 𝒮_1 → 𝒮_2 on the CMFs of 𝒮_1 such that

1. For every c ∈ 𝒮_1, φ(c) is from the same instantiated assumption class as c.

2. Either c = φ(c) or c < φ(c).


The model selection problem is to find a scenario model that is adequate and such
that there is no simpler adequate scenario model.


7.1.5 Model Formulation as Relevance Reasoning 

The approach to the model formulation problem advocated in this chapter is based on
the intuition that several aspects of the problem can be viewed as relevance reasoning.
We explain our view in this section.

Intuitively, the model formulation problem can be viewed as a combination of
two subproblems. The first is to determine which phenomena (and therefore which
variables) are relevant to the query variable. The second problem is to determine the
level of detail at which to model the relevant phenomena. These problems are closely
related, because the decision to model a certain phenomenon at a greater level of
detail may require modeling additional phenomena.

Selecting the Relevant Phenomena 

The first part of the model formulation problem is to decide which variables are
relevant to the query variable (and therefore, decide which phenomena should be
modeled). Intuitively, a variable u is relevant to the query variable v if u can causally
influence v, i.e., either (1) there is some state of the system in which u causally affects
v or (2) u can cause a change in the state of the system (and therefore indirectly affect
the value of v). Consequently, finding the relevant variables can be done by following
the possible causal influences between variables. The algorithm that we describe in
this chapter traces through all the possible causal influences on the query variable.
Note that the intuition underlying this algorithm is similar to the intuition underlying
the construction of the query-tree, where we represented all the possible derivations
of the query.

Selecting the Level of Detail

The second part of the model selection problem is determining the level of detail
at which to model each phenomenon. This entails deciding which abstractions and
approximations can be made in modeling the system. As described in Chapter 6,
knowledge underlying such decisions can be stated as relevance claims and better
understood when stated in that form. In our algorithm, we bring relevance knowledge
to bear in two ways:

• We articulate the difference between CMFs in an assumption class by the
modeling constraints, expressed partially by relevance claims. Previous treatments of
assumption classes require that every CMF have a set of modeling constraints,
but they do not require that the constraints be related in any way (except
for being mutually exclusive). Articulating the precise differences between the
CMFs is a more principled method of building assumption classes and enables
us to determine when to switch from one model fragment to another.

• Engineers have good general heuristics for selecting relevant detail in modeling
of physical systems. We use the modeling constraints C to express these
heuristics declaratively and reason with them.

Our modeling algorithm will use both kinds of knowledge to select the simplest
scenario model.


Partial Knowledge about the Simulation States

Our algorithm selects a scenario model for a set of possible states of the system.
Envisioning all the possible states that the system may reach beginning from the
initial state is a very expensive operation [de Kleer and Brown, 1984], which we do not
want to perform as part of the model formulation process. Therefore, our algorithm
selects the scenario model based only on partial knowledge of the possible states.
This knowledge is given implicitly by the set E of the variables that are assumed
to be exogenous and the time invariant facts in the description of the system. The
problem we face here is analogous to the relevance reasoning problem considered in
Chapter 3, in which we wanted to decide which ground formulas are irrelevant to the
query without actually knowing the contents of the database. In analogy to what
we did there, this entails that we assume that the system can actually reach any
state that is consistent with our partial knowledge. As in Chapter 3, any additional
knowledge about the reachable states may enable us to select a simpler scenario model.
Assuming partial knowledge about the world is a key aspect in making relevance
reasoning practical.

7.2 Model Formulation Algorithm

Based on these observations, we now describe our model formulation algorithm.
Informally, the algorithm follows all possible influences on the query in order to find
all the variables that can affect the query. For each such variable, the algorithm
selects the simplest CMF that describes it such that the set of selected CMFs make
consistent modeling assumptions.

To find all the variables that can affect the query, the algorithm begins by
considering all the assumption classes in which the query variable may be an output variable.
From each such assumption class we select one CMF and recursively consider all the
variables that can affect the query through the chosen CMF. These include:

1. If v is a quantity, we include all the quantities that appear in the same equation
with v.

2. All the variables that appear in the operating conditions of the model fragment.

The recursion bottoms out when we reach the exogenous variables given in E.
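
The backward traversal described above can be sketched as a plain reachability computation. This is a minimal illustration under a simplifying assumption: the direct influences are already available as a dictionary (here called `influences`, a hypothetical name), whereas the algorithm in this chapter derives them from the chosen CMFs. The variable names in the example graph are also illustrative.

```python
from collections import deque


def variables_affecting(query, influences, exogenous):
    """Collect every variable that can influence `query`, directly or indirectly.

    `influences` maps a variable to the variables that can directly affect it.
    Exogenous variables are recorded but never expanded.
    """
    reached = {query}
    frontier = deque([query])
    while frontier:
        var = frontier.popleft()
        for source in influences.get(var, ()):
            if source in reached:
                continue
            reached.add(source)
            if source not in exogenous:   # the recursion bottoms out at E
                frontier.append(source)
    return reached


influences = {
    "Voltage": ["Charge-level", "Damaged"],
    "Damaged": ["Charge-level"],
    "Charge-level": ["Current"],
    "Current": ["Solar-output"],
}
found = variables_affecting("Voltage", influences, exogenous={"Current"})
print(sorted(found))  # ['Charge-level', 'Current', 'Damaged', 'Voltage']
```

Because Current is declared exogenous, the traversal stops there and Solar-output is never reached.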

To select a CMF from an assumption class, we maintain a list, Rel, of modeling
assumptions made thus far about the model. The list initially includes the
assumptions given in Init (and in particular, the relevance of the query). At every step, we
choose the simplest CMF that does not contradict the assumptions in Rel.

Adding a new CMF to the scenario model may imply that we add additional
assumptions to Rel and that we need to revise previous choices of CMFs. We perform
adjustment steps (via the while-loop in select-scenario-model) until all the choices
of CMFs are consistent. The details of the algorithm are shown in Figure 7.3. Note
that Pos(As_c) denotes the positive literals in the assumptions made by a CMF c. The
function DeductiveClosure(D) returns the set of ground atomic formulas derivable
from D and the rules in C. In what follows, we illustrate the execution of the algorithm
with an example.

Example 

The example is a simple circuit containing a solar array (SA1) and a rechargeable
battery (BA1), shown in Figure 7.4. Figure 7.5 shows the scenario description and
Figure 7.6 shows the model fragments in the library. For each CMF in the domain
theory, the CMF's behavior conditions and the list of variables appearing in its
operating conditions are shown. The annotated assumption classes are shown in Figure 7.7.
The query is Voltage(BA1), with a list of exogenous variables which includes all the
variables mentioned in the scenario description except Damaged(BA1). The set of
modeling constraints is empty.

procedure select-scenario-model(v, E, Init, C)
begin
    Q = {v}.
    Rel = Init.
    Model = nil.
    repeat
        q = dequeue(Q).
        As = assumption classes in which q can be an output variable and
             whose operating conditions do not contradict E.
        for each a ∈ As do:
            select-from-assumption-class(a, q).
        while there is a pair (c, q') ∈ Model such that ¬p ∈ As_c and p ∈ Rel
            remove (c, q') from Model.
            select-from-assumption-class(A_c, q').
            /* A_c is the assumption class from which c was chosen */
    until Q is empty.
    return the set {c | (c, q') ∈ Model}.
end select-scenario-model.

procedure select-from-assumption-class(A, q)
/* A is an instantiated assumption class determining q. */
begin
    c = the simplest CMF in A such that ¬∃p (¬p ∈ As_c ∧ p ∈ Rel).
    Model = Model ∪ {(c, q)}.
    Rel = DeductiveClosure(C ∪ Rel ∪ Pos(As_c)).
    inputs = the union of:
        the quantities that appear in equations with q and
        the terms in the operating conditions of c.
    for every X ∈ inputs do
        if X has not been in Q and X ∉ E then
            enqueue X onto Q.
    if Relevant(q_1) ∈ Rel and q_1 ∉ E and q_1 has not been in Q then
        enqueue q_1 onto Q.
end select-from-assumption-class.

Figure 7.3: Model selection algorithm.

The query variable Voltage(BA1) is the only item on the queue initially, and so
we identify Battery-voltage-ac(BA1) as an assumption class that can affect it.⁸ To
select a CMF out of this assumption class, we start from the simplest, Constant-voltage-CMF.
Since there are no earlier modeling assumptions, this choice is consistent,
and we select this CMF. This results in addition of the following to our modeling

⁸ We assume that there is a data structure that enables us to efficiently find the assumption classes
that affect a given variable without searching the whole model fragment library.

Figure 7.4: An example circuit. SA1 is a solar array and BA1 is a rechargeable battery.


Scenario Description
    Solar-array(SA1)
    Battery(BA1)
    Rechargeable(BA1)
    Plus-terminal(BA1) = t1
    Minus-terminal(BA1) = t4
    Plus-terminal(SA1) = t2
    Minus-terminal(SA1) = t3
    Electrically-connected(t2, t4)
    Electrically-connected(t1, t3)
    ¬Damaged(BA1)

Legend
    CL: Charge-level(X)
    V: Voltage-produced(X)
    TEMP: Temperature-of(X)
    I: Current(Plus-terminal(X))
    DOD: Average-depth-of-discharge(X)
    TSLC: Time-since-last-conditioning(X)

Figure 7.5: The initial state of the system


assumption list, Rel:

Relevant(Battery(BA1)) and Relevant(Damaged(BA1)).

Since the variable Voltage(BA1) can be influenced by the variable Damaged(BA1)
(through the CMF Constant-voltage-CMF(BA1)), which is not exogenous, the
variable Damaged(BA1) is placed on the queue and becomes the new current goal. We
find the assumption class Battery-damage-due-to-overcharge-ac that can affect
Damaged(BA1), out of which Battery-damage-CMF is selected since it is the only
member. This selection causes the literals:

Relevant(Rechargeable(BA1)) and
Relevant(Charge-level(BA1))

to be added to Rel. However, this makes the assumption list inconsistent since
both ¬Relevant(Rechargeable(BA1)) and ¬Relevant(Charge-level(BA1)) were
assumed by Constant-voltage-CMF.

CMF                          Behavior                      Variables in Operating Cond.

Battery-voltage assumption class:
Constant-voltage-CMF         V = c0                        Battery(X), ¬Damaged(X)
Binary-voltage-CMF           V = 0 if CL < c0,             Battery(X), ¬Damaged(X)
                             V = [illegible] if CL > c0
Normal-degrading-CMF         V = f(Time)                   Battery(X), ¬Damaged(X)
Charge-sensitive-CMF         V = f(CL)                     Battery(X), ¬Damaged(X),
                                                           Rechargeable(X)
Temperature-sensitive-CMF    V = f(TEMP, CL)               Battery(X), ¬Damaged(X),
                                                           Rechargeable(X)

Battery-charge-level assumption class:
Constant-charge-level-CMF    CL = c1                       Battery(X), ¬Damaged(X)
Normal-accumulation-CMF      CL = ∫ I dt                   Battery(X), ¬Damaged(X),
                                                           Rechargeable(X)
Accumulation-with-aging-CMF  CL = ∫ I dt − f(DOD, TSLC)    Battery(X), ¬Damaged(X),
                                                           Rechargeable(X)

Battery-damage-due-to-overcharge assumption class:
Battery-damage-CMF           Damaged(X)                    Battery(X), ¬Damaged(X),
                                                           Rechargeable(X), CL(X)

Figure 7.6: Scenario description and model fragments


[Figure 7.7 diagram. For each assumption class, the CMFs are linked from simpler to
more complicated, and the links are annotated with modeling assumptions.
Battery-charge-level-ac contains Constant-charge-level-CMF, Normal-accumulation-CMF
and Accumulation-with-aging-CMF, with annotations Relevant(I),
Relevant(Rechargeable-battery(X)), Relevant(DOD) and Relevant(TSLC).
Battery-voltage-ac contains Constant-voltage-CMF, Binary-voltage-CMF,
Normal-degrading-CMF, Charge-sensitive-CMF and Temperature-sensitive-CMF, with
annotations Relevant(CL), Large(TPOG), Relevant(Rechargeable(X)),
Small(Granularity) and Relevant(TEMP).]

Figure 7.7: Assumption classes


To resolve the inconsistency, we adjust the choice of Constant-voltage-CMF, and
we now select Charge-sensitive-CMF, which is the simplest CMF that does not
contradict the current modeling assumptions.

The current goal variable now becomes Charge-level(BA1). The assumption
class that can affect this variable is Battery-charge-level-ac, and we select from
it the CMF Normal-accumulation-CMF(BA1), which is the simplest CMF that is
consistent with the current modeling assumptions. Current(Plus-terminal(BA1)) can
influence Charge-level(BA1) through this CMF. However, since it is an exogenous
variable, it is not placed on the queue. The queue is now empty and the procedure
terminates. The resulting scenario model contains:

Charge-sensitive-CMF(BA1),
Battery-damage-due-to-overcharge-CMF(BA1),
Normal-accumulation-CMF(BA1),

and makes the following modeling assumptions:

Relevant(Battery(BA1)), Relevant(Damaged(BA1)),
Relevant(Rechargeable(BA1)), Relevant(Charge-level(BA1)),
Large(TPOG), Small(Granularity),
¬Relevant(Temperature(BA1)), ¬Relevant(DOD), ¬Relevant(TSLC).

Note that the procedure terminates at this point in the example because the variable
Current(Plus-terminal(BA1)) was specified as exogenous in the query. Had it not
been specified exogenous, the procedure would have added more CMFs to the model,
including those representing other components and processes affecting the current.
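
The procedure of Figure 7.3 and the battery trace can be approximated by the following sketch. It is an illustration under stated assumptions, not the original implementation. The assumption sets attached to each CMF (`pos`, `neg`) are hypothetical annotations chosen to be consistent with the trace, since the annotations of Figure 7.7 are not fully legible. Modeling constraints are ground Horn rules written as (body, head) pairs. The check that operating conditions do not contradict E is omitted. The most complicated CMF of each class is assumed to make no simplifying assumptions, so a consistent choice always exists.

```python
from collections import deque
from dataclasses import dataclass

fs = frozenset

def rel(term):
    return ("Relevant", term)            # the ground atom Relevant(term)

@dataclass(frozen=True)
class CMF:
    name: str
    pos: frozenset       # positive modeling assumptions, Pos(As_c)
    neg: frozenset       # atoms p such that "not p" is assumed by the CMF
    inputs: frozenset    # equation quantities and operating-condition terms

@dataclass(frozen=True)
class AssumptionClass:
    name: str
    output: str
    cmfs: tuple          # ordered from simplest to most complicated

def deductive_closure(atoms, rules):
    """Forward chaining over ground Horn rules given as (body, head) pairs."""
    closed, changed = set(atoms), True
    while changed:
        changed = False
        for body, head in rules:
            if head not in closed and set(body) <= closed:
                closed.add(head)
                changed = True
    return closed

def select_scenario_model(v, exogenous, init, rules, classes):
    queue, seen = deque([v]), {v}
    rel_atoms = deductive_closure(init, rules)
    model = {}                           # assumption class name -> chosen CMF
    by_name = {ac.name: ac for ac in classes}

    def enqueue(term):
        if term not in seen and term not in exogenous:
            seen.add(term)
            queue.append(term)

    def select_from(ac):
        nonlocal rel_atoms
        # Simplest CMF whose simplifying assumptions do not contradict Rel.
        c = next(c for c in ac.cmfs if not (c.neg & rel_atoms))
        model[ac.name] = c
        rel_atoms = deductive_closure(rel_atoms | c.pos, rules)
        for term in sorted(c.inputs):
            enqueue(term)
        for kind, term in sorted(rel_atoms):
            if kind == "Relevant":
                enqueue(term)

    while queue:
        q = queue.popleft()
        for ac in classes:
            if ac.output == q:
                select_from(ac)
        # Adjustment steps: revise choices contradicted by newer assumptions.
        while True:
            stale = [name for name, c in model.items() if c.neg & rel_atoms]
            if not stale:
                break
            select_from(by_name[stale[0]])
    return {c.name for c in model.values()}, rel_atoms

# Battery example; the pos/neg sets below are hypothetical annotations.
B, D, R = "Battery(BA1)", "Damaged(BA1)", "Rechargeable(BA1)"
CL, I, T = "Charge-level(BA1)", "Current(Plus-terminal(BA1))", "Temperature(BA1)"
voltage = AssumptionClass("Battery-voltage-ac", "Voltage(BA1)", (
    CMF("Constant-voltage-CMF", fs({rel(B), rel(D)}), fs({rel(R), rel(CL)}), fs({B, D})),
    CMF("Binary-voltage-CMF", fs({rel(CL)}), fs({rel(R)}), fs({B, D, CL})),
    CMF("Normal-degrading-CMF", fs({("Large", "TPOG")}), fs({rel(R), rel(CL)}), fs({B, D})),
    CMF("Charge-sensitive-CMF",
        fs({rel(R), rel(CL), ("Large", "TPOG"), ("Small", "Granularity")}),
        fs({rel(T)}), fs({B, D, R, CL})),
    CMF("Temperature-sensitive-CMF", fs({rel(R), rel(CL), rel(T)}), fs(), fs({B, D, R, CL, T})),
))
charge = AssumptionClass("Battery-charge-level-ac", CL, (
    CMF("Constant-charge-level-CMF", fs(), fs({rel(I), rel(R)}), fs({B, D})),
    CMF("Normal-accumulation-CMF", fs({rel(I), rel(R)}),
        fs({rel("DOD"), rel("TSLC")}), fs({B, D, R, I})),
    CMF("Accumulation-with-aging-CMF", fs({rel(I), rel(R), rel("DOD"), rel("TSLC")}),
        fs(), fs({B, D, R, I, "DOD", "TSLC"})),
))
damage = AssumptionClass("Battery-damage-due-to-overcharge-ac", D, (
    CMF("Battery-damage-CMF", fs({rel(R), rel(CL)}), fs(), fs({B, D, R, CL})),
))
chosen, assumptions = select_scenario_model(
    "Voltage(BA1)", {B, R, I}, set(), [], [voltage, charge, damage])
print(sorted(chosen))
# ['Battery-damage-CMF', 'Charge-sensitive-CMF', 'Normal-accumulation-CMF']
```

With these annotations the sketch first picks Constant-voltage-CMF, revises it to Charge-sensitive-CMF once Battery-damage-CMF makes the charge level relevant, and stops at the exogenous current, mirroring the trace above.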

7.3 Analysis 

In this section we prove that our algorithm produces the simplest adequate scenario
model for answering the query. We discuss the assumptions under which this
result holds and discuss the consequences of relaxing them. The following theorem
establishes the main properties of the algorithm:

Theorem 7.4: Let ℳ be a library of model fragments describing the domain, and C
be a set of modeling constraints. Let S be a description of a system and (v, E, Init)
be a query about the system. Let 𝒮 be the scenario model resulting from algorithm
select-scenario-model. Furthermore, assume that:

• The library coherence assumption holds.

• If c_i and c_j are two CMFs in an assumption class, such that c_i < c_j, then c_i is
a causal approximation of c_j.

• All modeling constraints in C are either ground atomic formulas or Horn rules.

• The most complicated scenario model, defined to be all the possible instantiations
of CMFs that are the most complicated in their assumption class, is adequate
for answering the query.⁹

Then, 𝒮 is an adequate scenario model for (v, E, Init) and there is no scenario
model 𝒮_1 such that 𝒮_1 is simpler than 𝒮.

⁹ Note that the most complicated scenario model needs to include only the valid instantiations of
model fragments, i.e., instantiations in which the objects satisfy the type conditions in the definition
of the model fragment and the time invariant facts in the description of the system.

Proof: First we note that because of the last assumption in the statement of the
theorem, the algorithm (i.e., the while loop in select-scenario-model) must
terminate. This is because we can always adjust a choice of a CMF to a more complicated
CMF that does not contradict the assumptions in Rel. Ultimately, we will end up
with the most complicated scenario model, which is guaranteed to be adequate.

We first consider the adequacy of 𝒮. Condition C1 of adequacy requires that
there exist a logical-model M of the modeling constraints C in which all the modeling
assumptions of the CMFs in 𝒮 are satisfied (condition A) and such that any variable v_1
for which Relevant(v_1) is satisfied in M is included in one of the CMFs in 𝒮 (condition
B). We define M to be the logical-model which satisfies the positive literals in Rel
and the negation of positive literals not appearing in Rel. M is a logical-model of
C because Rel ∪ C is closed under deduction. Condition A is satisfied because the
following holds in the algorithm:

(1) For every CMF c ∈ 𝒮, Pos(As_c) ⊆ Rel.

(2) Whenever some simplifying assumption of a CMF c is not satisfied in Rel (i.e.,
¬p ∈ As_c but p ∈ Rel), we adjust the choice of c.

Because of (1), all the positive literals in c are satisfied in M. Because of (2), all the
negative literals of c are satisfied in M.

Condition B is satisfied by M because whenever Relevant(v_1) ∈ Rel, where v_1 is
a variable, then either

• v_1 ∈ E, or

• v_1 was put on the queue, and some assumption class that can affect v_1 was
subsequently added to 𝒮.

Therefore, in both cases, 𝒮 will contain a CMF that includes v_1.

To complete the proof of adequacy we need to show that condition C2 is satisfied,
i.e., that every resulting simulation model 𝒮_s for a state s can be made complete by
adding exogenous variables and that every such model determines the query variable
v.

In building the scenario model we considered all the assumption classes that can
affect v. The operating conditions of a CMF from at least one of these assumption
classes must be satisfied in s, since otherwise that would imply that there is no model for
the system. Therefore 𝒮_s includes the variable v. The library coherence assumption
guarantees that the set of equations in 𝒮_s, which we denote by Eq_s, is not over
constrained. We need to show that there is a choice of exogenous variables which
does not include v that will make the equations Eq_s complete. If Eq_s includes m
equations and l > m variables, we need to choose l − m exogenous variables. We can
assume that Eq_s does not contain a complete subset of equations. If it does, there
are two cases:

1. It contains a complete subset Eq_1 that includes v. In this case, we simply
consider any choice of variables that makes Eq_s − Eq_1 complete. Since v is
determined by some variable in Eq_1, the choice for Eq_s − Eq_1 will suffice.¹⁰

2. It contains a complete subset Eq_1 that does not include v. In this case, we solve
the equations in Eq_1 and consider the equations Eq_2 resulting from replacing
the appearances of variables from Eq_1 in Eq_s − Eq_1 by their solution values.¹¹
The resulting set of equations will not be over constrained (because we reduced
the number of equations and the number of variables by the same number).

Let v_i be a variable in Eq_s. We show that the set Eq = Eq_s ∪ Ex(v_i) is not over
constrained.¹² Suppose, to the contrary, that it is over constrained. There would
then be a subset of Eq that contains more equations than variables. That subset of
Eq must contain the equation Ex(v_i), because otherwise Eq_s would have been over
constrained. Furthermore, that subset must contain v_i in some other equation as
well. Now consider the set of equations Eq − Ex(v_i). That set contains the same
number of variables as in Eq, with one less equation. Therefore it must either be
over constrained, contradicting the assumption that Eq_s is not over constrained, or be
complete, contradicting the assumption that Eq_s does not contain a complete subset
of equations. Therefore, we can choose a variable in Eq_s which is not v and make
it exogenous, and the set of equations will either be complete (in which case we are
done) or will still be under constrained (in which case, we choose another variable).
After choosing l − m variables, the equations will be complete. Consequently, C2
holds.
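
The counting argument above can be illustrated with a purely structural sketch. It rests on assumptions that go beyond the text: each equation is represented only by the set of variables it mentions, "over constrained" is read as "some subset has more equations than variables", and that condition is tested through a bipartite matching (by Hall's theorem the two are equivalent). The function names are hypothetical, and the code does not solve any equations.

```python
def matching_size(equations):
    """Size of a maximum matching between equations and distinct variables."""
    owner = {}                      # variable -> index of the equation matched to it

    def assign(i, visited):
        for var in sorted(equations[i]):
            if var in visited:
                continue
            visited.add(var)
            if var not in owner or assign(owner[var], visited):
                owner[var] = i
                return True
        return False

    return sum(assign(i, set()) for i in range(len(equations)))


def over_constrained(equations):
    # Some subset has more equations than variables exactly when the
    # equations cannot all be matched to distinct variables.
    return matching_size(equations) < len(equations)


def choose_exogenous(equations, query):
    """Pick exogenous variables other than `query` until the counts agree."""
    eqs = [set(e) for e in equations]
    variables = set().union(*eqs)
    chosen = []
    for var in sorted(variables - {query}):
        if len(eqs) == len(variables):
            break                   # as many equations as variables: complete
        if not over_constrained(eqs + [{var}]):
            eqs.append({var})       # the equation Ex(var)
            chosen.append(var)
    return chosen


eq_s = [{"v", "x"}, {"x", "y"}]     # m = 2 equations over l = 3 variables
print(choose_exogenous(eq_s, "v"))  # ['x']
```

In this example l − m = 1, so a single exogenous variable is chosen and the query variable v stays determined by the equations.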



¹⁰ Note that if v belongs to a singleton set of complete equations, then this means that we have
a model fragment that is modeling v as constant. Since the modeling conditions in the state are
consistent, this is an adequate simulation model.

¹¹ Note that when solving a set of qualitative equations, the solution may contain some ambiguity
(i.e., several possible solutions), and we need to consider each solution in turn.

¹² The equation Ex(v_i) denotes that the variable v_i is exogenous.

Finally, we need to show that the model is as simple as possible. In this proof it is
important to remember that we have only partial knowledge about the possible states
that the system may reach. Specifically, all we know is that the time-independent
facts given in the system description must hold and that the value of binary variables
given in E cannot change.

In the proof, we assume that 𝒮 was constructed by adding the CMFs c_i from
assumption class a_i at the ith iteration of select-from-assumption-class. Note that
some of the c_i's may have been removed subsequently by choosing a more complicated
CMF from the same assumption class. We prove the following by induction on i:

A1. There must be a CMF in 𝒮 from the assumption class a_i.

A2. The CMF c_i is the simplest CMF that can be chosen from a_i, w.r.t. C.

A3. For each variable v_1 on the queue, we must include all the phenomena that can
affect v_1 and can occur in one of the possible states of the system.

Conditions A1 and A3 guarantee that all the phenomena modeled in 𝒮 are necessary.
A2 guarantees that all these phenomena are modeled in the simplest way possible
with respect to the modeling constraints, C. The simplicity of 𝒮 follows from these
claims.

The base case includes all the assumption classes that can affect the query variable
v. Clearly, A1 is satisfied because we need an assumption class that can determine v,
and the ones that were chosen were those that are consistent with the possible states
of the system. Since select-from-assumption-class selects the simplest model
fragments in these assumption classes that do not contradict Rel, condition A2 is satisfied.
Condition A3 is satisfied because if a variable v_1 appears with v in the same
equation, then v_1 can causally influence v. If v_1 is not exogenous, then any phenomenon
that can influence v_1 must be included in the model. Similarly, if v_1 appears in the
operating conditions of a CMF that can determine v and is not exogenous, then any
phenomenon that can affect v_1 must be included in the model.

We assume the claims for i and we prove them for i + 1. The CMF c_{i+1} could
have been added in two ways. In the first, we use the outer loop (i.e., adding a
new assumption class when popping a variable from the queue). By the inductive
assumption, we must include all the phenomena that can affect the variable on the top
of the queue. Therefore, adding a CMF from a_{i+1} is necessary, and so A1 is satisfied.
As before, A2 and A3 are satisfied because select-from-assumption-class selects
the simplest CMF c that satisfies the assumptions made so far and adds only the
necessary variables to the queue.

The second possibility for adding c_{i+1} is by the inner loop (i.e., by adjusting a
previous choice from an assumption class). In this case, the inclusion of a CMF from
a_{i+1} was justified by a previous CMF added to 𝒮. Since the modeling assumptions in
Rel include only those that are entailed by C and previous modeling assumptions, they
are therefore the minimal set of assumptions, and since c_{i+1} is the simplest CMF from
a_{i+1} that can be included in the scenario model, A2 is therefore satisfied.¹³ Finally,
the variables that were put on the queue when c_{i+1} is put in 𝒮 are necessary using
the same argument as before. Moreover, any variable that is already on the queue
does not have to be removed, because the CMF c_{i+1} is replacing a CMF c_j that is
a causal approximation of c_{i+1}, and therefore, any causal influence that was possible
through c_j will be possible through c_{i+1}. ∎

¹³ Note that if c_{i+1} was put in instead of c_j for j < i + 1, then c_j was removed from 𝒮.

The following theorem shows that 𝒮 is built in time that is polynomial in the size
of the problem:

Theorem 7.5: Let d be the maximum number of CMFs in an assumption class and
let n be the number of instantiated assumption classes in 𝒮. Let l be the sum of the
size of C and the number of ground atoms that appear in the instantiated modeling
conditions of CMFs in 𝒮. The running time of finding 𝒮 is polynomial in n, d and l.

Proof: Since the modeling constraints are Horn, computing the logical closure of the
set of modeling assumptions is done in time polynomial in l. This is done every time
we call the procedure select-from-assumption-class. The number of times this
procedure is called is at most nd. This can be seen by observing that every call to
select-from-assumption-class may, at worst, replace a CMF by another one that
is more complicated than it. Since there are n instantiated assumption classes in 𝒮
and at most d CMFs per class, this can only be done nd times. Consequently, the
overall running time of the algorithm is polynomial in n, d and l. ∎
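
The counting argument of the proof can be summarized as one bound. In the sketch below, T and p are names introduced only for illustration: T is the total running time and p is some polynomial bounding a single closure computation. The bound also assumes that the closure dominates the cost of each call to select-from-assumption-class.

```latex
T(n, d, l) \;\le\;
  \underbrace{n\,d}_{\text{calls to select-from-assumption-class}}
  \cdot
  \underbrace{p(l)}_{\text{closure per call}}
```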

7.3.1 Relaxing the Assumptions

In this section we discuss the effect of relaxing some of the assumptions made in
Theorem 7.4.

The Library Coherence Assumption 

The most significant assumption that we made is the library coherence assumption.
Although the assumption may seem a bit strong, there is a compelling argument for
it. Specifically, if the assumption does not hold, this indicates a problem with the
model fragment library. If we have a set of model fragments that satisfy the modeling
constraints but give rise to an over constrained set of equations, this is an undesirable
feature of the library that calls for additional knowledge acquisition. It should be
noted that the library coherence assumption is made implicitly in [Falkenhainer and
Forbus, 1991]. In fact, if we assume (as in Qualitative Process Theory [Forbus,
1984]) that all equations are uniquely causally oriented, then the library coherence
assumption follows when we make the causal approximations assumption and the
assumption that the most complicated scenario model is adequate.

We can relax the library coherence assumption at the cost of doing more work at
every state of the simulation. Specifically, in the absence of the coherence
assumption, the scenario model created by our algorithm is guaranteed to produce a set of
equations at every state from which a complete model can be extracted (perhaps by
removing some equations). We can extract the complete model efficiently using the
methods described in [Nayak, 1992a].

Causal Approximations and Horn Restriction 

The only role of the causal approximations assumption and the restriction that the
modeling constraints must be Horn is to guarantee efficient performance of the model
formulation algorithm. The causal approximations assumption guarantees that when
we select a more complicated CMF in an assumption class, the set of simplifying
modeling assumptions decreases (i.e., more positive literals are added to Rel). The
Horn restriction guarantees that once a positive literal has been put in Rel, it will
not be retracted. Relaxing either of these two assumptions will require the algorithm
to perform arbitrary backtracking and constraint satisfaction. As shown in [Nayak,
1992a], this will cause the model selection problem to be intractable.
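
The "no retraction" property of the Horn restriction can be seen in a minimal forward-chaining sketch. This is an illustration rather than the DeductiveClosure routine of the original system, and the single rule used here is a hypothetical modeling constraint. Because ground Horn rules only ever add positive literals, the closure of a larger set of assumptions always contains the closure of a smaller one.

```python
def horn_closure(facts, rules):
    """Ground atoms derivable from `facts` by rules given as (body, head) pairs."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in closed and body <= closed:
                closed.add(head)
                changed = True
    return closed


# Hypothetical constraint: a relevant charge level makes the current relevant.
rules = [(frozenset({"Relevant(CL)"}), "Relevant(I)")]

before = horn_closure({"Relevant(Battery)"}, rules)
after = horn_closure(before | {"Relevant(CL)"}, rules)

print(before <= after)          # True
print(sorted(after - before))   # ['Relevant(CL)', 'Relevant(I)']
```

Adding Relevant(CL) only extends the earlier closure, so a choice that was consistent with the earlier positive literals never has to be re-derived from scratch.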


7.4 Related Work 

Several researchers have considered the problem of model formulation. Their work
addresses one or both of the two aspects of the model formulation problem, namely
model construction and model simplification.

Nayak [Nayak, 1992a] addressed both aspects. Nayak describes an algorithm for
constructing a model for the single state case. His algorithm also follows possible
causal influences; however, these influences must be given explicitly using the
component interaction heuristic. In contrast, our work exploits the structure of the model
fragments to derive these links, thereby not burdening the user with the error prone
task of putting them in. It should be noted, however, that user intervention, as in
Nayak's scheme, can enable a further focusing of the search by inserting only a subset
of the links.

In choosing a model fragment from every assumption class, Nayak chooses the
most complicated one, and then uses a procedure to simplify the resulting model.
Our algorithm builds the model by selecting the simplest CMF possible in every class
and only adjusts the choice if necessary. In cases where the CMFs in an assumption
class vary significantly in their complexity, our approach leads to substantial savings
in the search, since we only introduce the complicated models if necessary. It should
be emphasized that the more complicated CMFs will involve more variables that will
be put on the queue and will therefore result in a much larger scenario model. Finally,
it should be noted that Nayak's methods for model simplification can be applied to
the simulation model generated at every state from our scenario model.

Falkenhainer and Forbus' work on compositional modeling [Falkenhainer and
Forbus, 1991] describes the representation aspects of compositional modeling and
addresses the model construction problem. In their framework, every model fragment
has a set of relevance conditions corresponding to our modeling conditions. Our use
of relevance claims enriches their language (specifically their Consider predicate) and
provides it with a formal basis. In their model formulation algorithm, they first select
the physical scope of the model (by identifying the lowest object down the partonomic
hierarchy that subsumes all the objects mentioned in the query) and then select the
relevant properties of these objects. They rely on heuristics to select types of
properties to be modeled. This approach can easily lead to inclusion of model fragments
that are not causally related to the query, and it cannot guarantee the sufficiency of
the model produced. Our algorithm provides more flexibility in that the selection of
the physical scope of the scenario model and the selection of the relevant properties
are done in a uniform way (by reasoning about the modeling constraints) and can
therefore affect each other. Furthermore, we only select properties to model that can
causally influence the query. Finally, to select the simplest model, they generate all
possible consistent sets of modeling assumptions and choose the simplest based on a
very informal criterion of simplicity. Our selection of the simplest model is based on
explicit representation of the differences between model fragments and on reasoning
with formulas expressing these differences.

Rickel and Porter's work on model formulation [Rickel and Porter, 1992] is similar 
to ours since it makes use of graphs of interaction paths among variables to select 
relevant model fragments. Their graph of interactions is less general than the causal 
influence graph created by our algorithm, since it only includes variables, while we 
include all terms (including variables, predicates and relations) that could directly or 
indirectly influence the goal terms. Their approach also does not provide guarantees 
of sufficiency or simplicity. 
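
The idea of selecting only terms that can directly or indirectly influence the goal terms can be sketched as a backward traversal of an influence graph. The following is a minimal illustration only, not the algorithm of Chapter 7: the function name `relevant_terms`, the dictionary representation of the graph, and the example terms are all assumptions made for the sketch.

```python
from collections import deque


def relevant_terms(influences, goal_terms):
    """Return every term that can directly or indirectly influence a goal term.

    `influences` maps a term to the set of terms it directly influences
    (a hypothetical encoding of a causal influence graph).
    """
    # Invert the edges so that we can walk from a goal back to its causes.
    influenced_by = {}
    for source, targets in influences.items():
        for target in targets:
            influenced_by.setdefault(target, set()).add(source)

    seen = set(goal_terms)
    frontier = deque(goal_terms)
    while frontier:
        term = frontier.popleft()
        for cause in influenced_by.get(term, ()):
            if cause not in seen:
                seen.add(cause)
                frontier.append(cause)
    return seen


# Hypothetical influence graph for a small electrical device.
influences = {
    "sunlight": {"solar_current"},
    "solar_current": {"battery_charge"},
    "battery_charge": {"bus_voltage"},
    "ambient_temperature": {"casing_temperature"},
}

selected = relevant_terms(influences, ["bus_voltage"])
```

In this example the terms describing temperature are never reached from the goal term, so a model fragment that mentions only them would not be selected.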

The idea of explicitly representing the differences between CMFs in an assumption 
class is similar to the graph of models by Addanki et al. [Addanki et al., 1989]. Their 
work addresses the problem of selecting among complete models. Since the models in 
their graph are complete models instead of fragments, the space requirement of their 
approach increases exponentially as the number of possible modeling assumptions 
increases. Our approach can be viewed as combining the idea of a graph of models 
with compositional modeling. 

The model simplification problem has been addressed by Williams [Williams, 
1990a] and Weld [Weld, 1990]. Williams also makes use of causal influence graphs to 
simplify a model. Both Weld and Williams assume a complete model of the situation 
as an input. Williams also makes use of the idea of following causal influences in his 
work on innovative design [Williams, 1990b]. 

7.5 Summary and Contributions 

This chapter described an application of relevance reasoning to the domain of mod- 
eling physical systems. Aside from showing that relevance reasoning is a viable ap- 
proach to solving the model formulation problem, we have also shown that the mod- 
eling problem can significantly benefit from being considered from the perspective 
of relevance reasoning. Specifically, we have shown that some aspects of the model- 
ing problem can be approached using general considerations of relevance reasoning 
(i.e., backward chaining on causal influences and articulating the differences between 
CMFs in assumption classes). Moreover, we have shown how to incorporate engineer- 
ing knowledge and heuristics for modeling in a declarative fashion, using relevance 
reasoning. The ability to declaratively express modeling heuristics has several advan- 
tages. Since it is easier to inspect and modify declarative knowledge, experimenting 
with different modeling heuristics becomes viable. In contrast, other methods wire 
in their modeling heuristics, and therefore modifying them requires rewriting code. 
The result of our approach was a novel model formulation algorithm which efficiently 
selects the simplest model for a system and a query. An important aspect of our 
algorithm is that it chooses a model for a simulation of the system without knowing 
precisely which states the system can reach. 

The algorithm has been implemented as part of a system called Device Modeling 
Environment (DME) [Iwasaki and Low, 1992], a device modeling program that 
provides a computational environment for the design of electromechanical devices. Given 
a topological description of a device, DME formulates a behavior model of the device 
using the compositional modeling approach and simulates its behavior. Prior to 
implementing our algorithm, the system would prompt the user to select a set of 
model fragments to be considered in the scenario model, thus creating a significant 
knowledge acquisition bottleneck. DME checks the operating conditions of every 
model fragment in the scenario model to determine the simulation model for each 
state. The system works on several examples, including the electrical power system 
of an earth orbiting satellite, of which the example used in this chapter is a much 
simplified version. 

Research on compositional modeling is in its infancy. The discussion in this chap- 
ter contributes by crystallizing some of the main questions regarding the approach that 
require additional research. The key issues that came to bear in this chapter are (1) 
how to write model fragments (i.e., how to decide what phenomena can be described 
in a single model fragment, and what assumptions to make regarding the contents of 
a model fragment), (2) how to organize model fragments in a library and (3) what 
assumptions can be made about the model fragment library. We have contributed to 
solving problem (2) by suggesting the concept of compositional model fragments and 
by requiring explicit representation of the differences between CMFs in an assump- 
tion class. In our discussion we made several assumptions regarding questions (1) and 
(3). In general, we see a tradeoff between (1) and (3). If more assumptions are made 
about individual model fragments, then fewer constraints need to be placed on the 
model library as a whole, and vice versa. Finding the optimal point in the spectrum 
of possibilities requires additional research and practical experience building systems. 
We believe that imposing some structure on the model fragment library is necessary, 
and beneficial in the long run to facilitate knowledge acquisition and reuse. 

Finally, as mentioned in Chapter 6, the problem of model formulation can be 
viewed as one instance of a problem solving setting in which a system needs to reason 
about its own knowledge before answering a query. In doing so, it must choose 
among alternative representations of the domain that make different assumptions and 
abstractions. Other instances of this problem are also currently under investigation, 
such as reasoning with contexts and query evaluation in heterogeneous databases. We 
believe that the techniques developed in this chapter can form the basis for reasoning 
mechanisms in these other problem solving tasks. 


Chapter 8 
Conclusions 


The ability to automatically identify and ignore irrelevant information is a key to pro- 
viding efficient inferences from large knowledge based systems and for a system to be 
able to create appropriate abstractions in a complex domain. The main contribution 
of this dissertation is showing that it is possible to reason effectively about relevance 
of knowledge in a principled manner and that such reasoning can significantly impact 
the performance of knowledge based systems. This chapter begins by summarizing 
the specific contributions of this dissertation. We then present a tabular summary of 
the main references to related work. Detailed discussion of related work is scattered 
at the relevant points throughout the dissertation. Finally, Section 8.2 concludes with 
a description of directions for future research. 

8.1 Summary of Contributions 

The two key issues that need to be addressed in relevance reasoning are (1) how 
to automatically decide what knowledge is irrelevant to a query and (2) what is 
the utility of relevance reasoning. As a basis for addressing these two issues, we 
presented a formal framework for analyzing irrelevance. The framework included a 
space of possible definitions of irrelevance based on a proof theoretic analysis of the 
notion. The framework enabled us to compare the properties of different irrelevance 
claims. Within the space of definitions, we identified the class of strong irrelevance 
claims that has two desirable properties; namely, strong irrelevance claims can be 
efficiently derived automatically and are guaranteed to lead to savings in inference. 
The framework also shed new light on the problem of deciding when a query is 
independent of an update to the knowledge base and enabled us to significantly 
extend previous results in this area. 

The framework provided a setting in which we could investigate the connection 
between the notion of irrelevance and the creation of abstractions. This connection led 
to a new approach to research on reasoning with abstractions in which we investigate 
the properties of an abstraction by considering the irrelevance claims on which it 
is based. We demonstrated the approach on the cases of predicate abstraction and 
argument projection. In both cases, the analysis of the corresponding irrelevance 
claims led to efficient algorithms for automatically creating abstractions and to a better 
understanding of the utility of the resulting abstractions. 

Title and reference                                                               Page(s)

Analysis of (ir)relevance:
  A theory of irrelevance [Subramanian and Genesereth, 1987; Subramanian, 1989]
  In the philosophy literature [Keynes, 1921; Carnap, 1950; Gärdenfors, 1978]     39
  Relevance logics [Anderson and Belnap, 1975; Avron, 1992]                       39
  In probabilistic reasoning [Pearl, 1988]                                        39

Static analysis of rules/clauses:
  Connection graphs [Kowalski, 1975; Sickel, 1976; Chang, 1979]                   76
  Static analysis in Explanation Based Learning [Etzioni, 1993; Etzioni, 1990]    75, 97
  Pushing constraint selections [Srivastava and Ramakrishnan, 1992]               75
  Partial evaluation of logic programs
    [Smith and Hickey, 1990; Lloyd and Shepherdson, 1991; Bruynooghe et al., 1991]

Automated reasoning and query evaluation:
  Knowledge compilation [Selman and Kautz, 1991]                                  100
  Deriving optimal search strategies [Smith, 1986; Greiner, 1991]                 97
  Message passing based query evaluation [Van-Gelder, 1986]                       96
  Magic set transformation [Ullman, 1989; Mumick et al., 1990]                    94

Independence of queries from updates:
  Detecting independence [Blakeley et al., 1989; Elkan, 1990]
  Conjunctive query containment [Klug, 1988; van der Meyden, 1992]                126

  Predicate abstraction [Plaisted, 1981; Tenenberg, 1990]                         144
  Projecting existential arguments [Ramakrishnan et al., 1988]                    144
  A theory of abstraction [Giunchiglia and Walsh, 1992]                           145
  Automatic creation of abstractions [Knoblock, 1990; Knoblock et al., 1991]      146

Modeling physical devices:
  Compositional modeling [Falkenhainer and Forbus, 1991]
  Causal approximations [Nayak, 1992a]                                            169
  Graphs of models [Addanki et al., 1989]                                         170
  Model simplification [Weld, 1990; Williams, 1990a]

Table 8.1: References to related work. 

We investigated in detail the problem of automatically deriving irrelevance claims 
for Horn rule knowledge bases and several extensions. Our analysis was based on 
the observation that in order for relevance reasoning to be practical, we must derive 
irrelevance claims by considering only a small and stable part of the knowledge base, 
while not assuming anything about the unexamined parts. We considered the problem 
of automatically deriving irrelevance claims that were based only on the rules in the 
KB and were independent of the ground formulas. As a result, our algorithms were 
efficient and the irrelevance claims derived were independent of changes to the ground 
formulas. 

Our algorithms for deriving irrelevance were based on a novel tool, the query- 
tree, which is one of the main contributions of this work. The query-tree is a finite 
structure that gives us a view of the knowledge base. It encodes precisely the set of 
possible derivations of the query. Consequently, it tells us exactly which rules and 
ground formulas can appear in derivations of the query, thus providing the basis for 
a sound and complete inference procedure for several classes of strong irrelevance 
claims. One of the key aspects of the query-tree is that it considers the semantics 
of the interpreted literals that appear in the rules, which often enables us to detect 
additional interactions between the rules. The query-tree can also be built to encode 
only the minimal derivations of the query, or only the satisfiable derivations in cases 
where EDB literals may appear negated in the antecedents of the rules. We also 
showed how the query-tree can be used to derive logical consequences of irrelevance 
claims that are given to the system by an external source, and to guide the search 
of a backward chainer so that it follows only paths that can yield derivations of the 
query. 
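
As a rough illustration of deciding relevance from the rules alone, the sketch below keeps only the rules and predicates reachable from the query predicate. It is a deliberately coarse approximation and not the query-tree itself: it ignores the interpreted constraints that the query-tree exploits, and the function name `relevant_rules`, the pair-based rule encoding and the example predicates are assumptions made for the example.

```python
def relevant_rules(rules, query_predicate):
    """Return (kept_rules, relevant_predicates) for a query predicate.

    `rules` is a list of (head_predicate, [body_predicates]) pairs, a
    hypothetical encoding of Horn rules with arguments omitted.
    """
    relevant = {query_predicate}
    changed = True
    # Propagate relevance from rule heads to rule bodies until a fixpoint.
    while changed:
        changed = False
        for head, body in rules:
            if head in relevant:
                for predicate in body:
                    if predicate not in relevant:
                        relevant.add(predicate)
                        changed = True
    kept = [(head, body) for head, body in rules if head in relevant]
    return kept, relevant


# Hypothetical rule base: only the first two rules can appear in a
# derivation of a `path` query.
rules = [
    ("path", ["edge"]),
    ("path", ["edge", "path"]),
    ("payroll", ["employee", "salary"]),
]

kept, relevant = relevant_rules(rules, "path")
```

Because the result depends only on the rules, ground facts for `employee` and `salary` can be added or removed without changing which formulas are judged irrelevant to the `path` query.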

We presented experimental results which showed that using the query-tree to filter 
out irrelevant formulas often yields speedups of orders of magnitude, while the cost of 
building the query-tree is negligible. Additional speedups were obtained by using the 
query-tree to guide the search of the backward chainer. Both the theoretical analysis 
and the experimental results showed that our methods will scale up and be even more 
effective in larger knowledge bases. 

Finally, we applied the relevance reasoning framework to the domain of modeling 
physical devices. We considered the task of selecting a model for a device to answer a 
given query by composing model-fragments, each describing a single phenomenon in 
the physical world at different levels of abstraction and using different approximations. 
We presented a novel model selection algorithm based on relevance reasoning. The 
algorithm used relevance reasoning in order to (1) determine which phenomena are 
relevant to the query (and therefore should be included in the model), and (2) to 
reason about the abstractions underlying the model-fragments which present multiple 
descriptions of these phenomena in order to determine the abstraction level most 
appropriate for the query. 

8.2 Future Work 

The work described in this dissertation can be extended in several ways. In this 
section we describe several directions for future research. 

Analysis of Irrelevance 

Our framework for analyzing irrelevance should be extended in several ways in order 
to make it applicable in a wider variety of settings. One important extension is 
to incorporate probabilistic irrelevance claims into the framework, i.e., claims stating 
that some formula in the knowledge base is irrelevant to a query with some probability. 
A clear understanding of the meaning of such irrelevance claims is needed as well 
as algorithms for automatically deriving them and methods for exploiting them in 
inference. Second, our analysis focused on the case in which answering a query is 
done by searching for a derivation, using some given set of inference rules. Although 
many problem solving situations can be cast in that way, doing so will not always yield 
useful results. Therefore, an important extension to the framework is to formalize 
irrelevance for general problem solving. For example, whereas now the framework 
revolves around the possible derivations of a query, a more general framework will 
revolve around paths in a state space. Finally, in our discussion we considered only the 
cases in which reasoning is monotonic. An interesting problem is to define irrelevance 
(and develop the corresponding algorithms) in the setting of non-monotonic reasoning. 

As mentioned in Chapter 2, the notion of irrelevance is very closely related to 
the notion of belief revision. A more thorough investigation of this connection could 
yield interesting results. On the one hand, considering definitions of irrelevance based 
on belief revision will yield more semantically based definitions of irrelevance. On 
the other hand, associating a definition of belief revision with irrelevance may shed 
light on the plethora of definitions of belief revision. Additionally, the problem of 
developing efficient algorithms for belief revision has received little attention to date. 
Algorithms for deriving irrelevance claims might prove to be a key tool in developing 
efficient belief revision algorithms. 

The Query-tree 

The query-tree has proven to be a powerful tool in relevance reasoning and controlling 
inference, and it is therefore interesting to extend it to a wider class of languages. 
One important extension is to widen the class of interpreted constraints that can be 
handled by the query-tree. Currently, in addition to constraint literals in the rules, the 
query-tree can also fully incorporate constraints that are given on the arguments of a 
relation in the knowledge base. One way to extend the tree is to consider constraints 
that include arguments from more than one relation (a.k.a. integrity constraints). 
An example of such a constraint is stating that a join of two (or more) relations 
is empty. Another important extension to the query-tree is to consider rules with 
function symbols. Although in general the query-tree cannot provide a complete 
inference procedure for strong irrelevance when function symbols are present, it is 
important to find limited cases in which such a procedure can be found. In cases 
where completeness cannot be guaranteed, one could develop methods that will detect 
a wide class of irrelevance claims encountered in practice. 

The use of the query-tree to control inference should also be investigated further. 
The uses we described and with which we experimented were straightforward applica- 
tions of the query-tree. As discussed in Chapter 4, the query-tree also enables gener- 
alization of other query-optimization methods such as Magic-sets and message passing 
schemes. Given the query-tree, we are in a position to devise a more general frame- 
work for query-optimization that will incorporate Magic-sets, message passing, tail 
recursion optimizations and pushing selections and projections. 

Finally, a key contribution of the query-tree is that it provides descriptions of 
specialized indices for accessing the ground formulas in the knowledge base. These 
indices are tailored for a specific set of queries. An important question that needs to 
be addressed in the context of any large system is how to combine the indices given 
by the query-tree with current database indexing techniques. 

Irrelevance and Abstractions 

One of the major areas for future work spawned by this dissertation is the connection 
between the notion of irrelevance and the creation of abstractions, as described in 
Chapter 6. The approach proposed there is to associate an abstraction with an 
irrelevance claim, stating which knowledge is removed in the abstract theory. An 
understanding of the abstraction is obtained by an analysis of the corresponding 
irrelevance claim. Chapter 6 listed several kinds of irrelevance claims that should 
each be investigated further. Of particular interest are the questions of (1) finding 
algorithms for automatically creating an appropriate abstraction, (2) understanding 
the utility of reasoning with the abstraction and (3) determining when and how 
abstractions can be composed (by composing the corresponding irrelevance claims). 
As we saw in Chapter 6, our treatments of irrelevance of predicate arguments and 
predicate refinements had many similarities. A longer term goal is to develop a general 
framework for treating a large class of irrelevance subjects. 

Modeling of Physical Systems 

Compositional modeling is a powerful paradigm for building systems that reason 
about physical systems. However, the basic building blocks of the approach require 
a much better understanding. Specifically, we need a clearer definition of what is a 
model-fragment, i.e., what phenomena should be considered a single model fragment, 
and what assumptions we can make about model fragments. Second, we need to 
understand how to build libraries of model fragments. Such libraries are not random 
collections of model-fragments, but rather have a lot of implicit and explicit struc- 
ture. Significant leverage in devising compositional modeling algorithms will come 
from discovering the underlying structure and exploiting it. The work described in 
Chapter 7 makes a few contributions in this direction, but much work remains to be 
done. Building large experimental systems will provide significant insights into these 
issues. 

Finally, we cannot expect a model-fragment library to contain model-fragments 
that describe a certain phenomenon at all possible levels of abstraction that may 
be needed in solving queries. Therefore, an interesting research problem is to auto- 
matically create model-fragments with the desired level of abstraction by abstracting 
model-fragments from the library. 

8.3 Final Word 

The work described in this dissertation is at the border between artificial intelligence 
and database systems. I believe that research combining techniques from these two 
fields will be of prime importance in future years. One of the major technological 
innovations in upcoming years will be the availability of large amounts of information 
in practically every household. Developing systems that will provide intelligent access 
to information presents a unique opportunity in which techniques from both artificial 
intelligence and databases will make a great impact on society. I believe that the 
combination of techniques from these two fields, as demonstrated in this dissertation, 
will provide the essential building blocks of successful systems of the information age. 

A.1 Proofs of Chapter 3 

Proof of Theorem 3.4 

Proof: In the proof it is more convenient to refer to the relations denoted by the 
labels, rather than the labels themselves. The conjunction of two labels $c_i$ and $c_j$ is 
represented by the join of their corresponding relations, denoted by $R_{c_i} \bowtie R_{c_j}$. Recall 
that since the constraint language satisfies the Closure property, we can express a 
join of two relations, and a projection of a relation on a subset of its variables, as 
a sentence in the given constraint language $\mathcal{L}$. We denote the relations represented 
by $c_0(n)$, $c_b(n)$ and $c_f(n)$ by $R_0(n)$, $R_b(n)$ and $R_f(n)$ respectively. We denote the 
projection of a relation $R$ on a subset $X$ of its variables by $R|_X$. 

Let $r_1, \ldots, r_l$ be the top-down ordering of the rule-nodes in $d$ that was used in 
the second phase of the algorithm. Recall that by the definition of $c_d$ (the global 
constraint on the variables in $d$), $c_d = c_0(r_1) \wedge \cdots \wedge c_0(r_l)$, or in notation of relations, 
$R_d = R_0(r_1) \bowtie \cdots \bowtie R_0(r_l)$. We define a sequence of relations as follows: 

• $R_0 = R_b(\mathit{root}(d))$ 

• $R_i = R_{i-1} \bowtie R_b(r_i)$ 

We prove that the following properties hold for the sequence $R_0, \ldots, R_l$: 

D1: $R_l = R_d$, i.e., the final relation is the same as the global constraint of $d$. 

D2: If $X_i$ is the set of variables that appear in $R_i$, then 

D2.1: $R_{i+1}|_{X_i} = R_i$, and 

D2.2: $R_i|_{r_i} = R_f(r_i)$. 

This means that once a set of variables appears in an intermediate relation, the 
subsequent relations do not change with respect to that subset. 

The theorem follows from properties D1 and D2 as follows. Let $r_i$ be the $i$th 
rule-node in $d$ and let $X_i$ be the variables that appear in the father or the children 
of $r_i$. It follows from D2 that $R_f(r_i) = R_l|_{X_i}$, and by D1, that $R_l = R_d$. Therefore, 
$R_f(r_i) = R_d|_{X_i}$ holds, which is the way $R_f(r_i)$ is defined. 

To prove D1, we observe that $R_l = R_b(r_1) \bowtie \cdots \bowtie R_b(r_l)$ (i.e., the join of the 
labels obtained in the bottom-up phase). Therefore, it is enough to show that 
$R_d = R_b(r_1) \bowtie \cdots \bowtie R_b(r_l)$. To show this we prove that for every rule node $r$ in the tree 
and its father node $g$ the following hold: 

(a) $R_b(r) \subseteq R_0(r)$, $R_b(g) \subseteq R_0(g)$, and 

(b) $R_b(r) \supseteq R_d|_r$, $R_b(g) \supseteq R_d|_g$. 

Since $R_d = R_0(r_1) \bowtie \cdots \bowtie R_0(r_l)$, (a) gives us that $R_l \subseteq R_d$. From the properties of 
the join operation and from (b) we get $R_l \supseteq R_d$. Hence, $R_l = R_d$. 

The proof of (a) and (b) proceeds by induction on the elements of $r_1, \ldots, r_l$ in 
reverse order (i.e., the bottom-up order). Note that the second parts of (a) and (b) 
follow from their first parts, since $R(g)$ is the projection of $R(r)$ on the variables in 
$g$. The base case consists of all the rule nodes whose children are all leaves. For each 
such node $r$, $R_b(r) = R_0(r)$ and therefore, (a) and (b) hold trivially. 

Assume (a) and (b) hold for all rule nodes $r_{i+1}, \ldots, r_l$. We need to show that they 
hold for $r_i$. Claim (a) holds because $R_b(r_i)$ is the intersection of $R_0(r_i)$ with the 
bottom-up labels of its children. To prove (b), let $g_1, \ldots, g_m$ be the children of $r_i$. 
By the induction assumption, $R_b(g_j) \supseteq R_d|_{g_j}$ for each subgoal $g_j$. Clearly, by the 
definition of $R_d|_{r_i}$, $R_0(r_i) \supseteq R_d|_{r_i}$. Therefore, since $R_b(r_i)$ is actually the join of 
relations that all contain $R_d|_{r_i}$,$^1$ $R_b(r_i) \supseteq R_d|_{r_i}$. 

We prove D2 by induction on $i$. For the base case $i = 0$, we note that $R_1$ is simply 
$R_b(r_1)$. Therefore, since $R_b(\mathit{root}(d))$ is the projection of $R_b(r_1)$ on the variables 
of $\mathit{root}(d)$, D2.1 holds. Moreover, since $R_f(r_1) = R_b(r_1)$, D2.2 holds too. 

We assume the claim holds for all $j < i$, and prove that it holds for $i$. We first 
prove D2.2. Let $g$ be the father of $r_{i+1}$ and $r_j$ be the father of $g$. By the inductive 
assumption, $R_i|_{r_j} = R_f(r_j)$, and therefore the same holds for the goal node $g$, i.e., 
$R_i|_g = R_f(g)$. Moreover, 

$$R_f(r_{i+1}) = R_f(g) \bowtie R_b(r_{i+1}) = R_i|_g \bowtie R_b(r_{i+1}).$$

However, since the variables common to $r_{i+1}$ and to $R_i$ are only those in $g$, the join 
and the projection commute, i.e., 

$$R_f(r_{i+1}) = (R_i \bowtie R_b(r_{i+1}))|_{r_{i+1}}.$$

$^1$More precisely, the relations $R_b(g_1), \ldots, R_b(g_m)$, $R_0(r_i)$ contain the respective projections of $R_d$ 
on their variables. 

To show D2.1, it is enough to show that $R_i|_g = R_{i+1}|_g$, since the variables ap- 
pearing in $g$ are the only ones common to $R_i$ and $R_b(r_{i+1})$, or equivalently, we can 
show 

$$R_f(r_{i+1})|_g = R_f(g). \tag{A.1}$$

The proof uses the following observation: 

If $A$ is the projection of a relation $R$ on a subset of its variables, and $B \subseteq A$, then 
joining $B$ with $R$ and projecting on the same variables will result in the relation $B$. 

In our case, 

$$R_b(r_{i+1})|_g = R_b(g). \tag{A.2}$$

Clearly, $R_f(r) \subseteq R_b(r)$ and $R_b(r)|_g \subseteq R_b(g)$, and therefore, since $R_f(g) = R_f(r)|_g$, it 
follows that 

$$R_f(g) \subseteq R_b(g). \tag{A.3}$$

Finally, recall that 

$$R_f(r_{i+1}) = R_f(g) \bowtie R_b(r_{i+1}). \tag{A.4}$$

Therefore, the above observation together with A.2, A.3 and A.4 entail A.1. □ 
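
The join and projection observation used in the proof above can be checked on a small finite example. The sketch below is only an illustration under assumed conventions: relations are represented as a tuple of variable names plus a set of rows, and the helper names `project` and `join` and the sample data are hypothetical.

```python
def project(schema, rows, keep):
    """Project a relation (schema, rows) onto the variables in `keep`."""
    positions = [schema.index(v) for v in keep]
    return tuple(keep), {tuple(row[i] for i in positions) for row in rows}


def join(schema_a, rows_a, schema_b, rows_b):
    """Natural join of two relations on their shared variables."""
    shared = [v for v in schema_a if v in schema_b]
    extra = [v for v in schema_b if v not in schema_a]
    out_schema = tuple(schema_a) + tuple(extra)
    out_rows = set()
    for ra in rows_a:
        for rb in rows_b:
            # Rows combine only when they agree on every shared variable.
            if all(ra[schema_a.index(v)] == rb[schema_b.index(v)] for v in shared):
                out_rows.add(tuple(ra) + tuple(rb[schema_b.index(v)] for v in extra))
    return out_schema, out_rows


# R is a relation over (x, y, z); A is its projection on x; B is a subset of A.
R_schema = ("x", "y", "z")
R = {(1, 1, 5), (1, 2, 6), (2, 3, 7), (3, 3, 8)}
A_schema, A = project(R_schema, R, ("x",))
B = {(1,), (3,)}

# Join B with R, then project back on the same variables.
J_schema, J = join(A_schema, B, R_schema, R)
_, result = project(J_schema, J, ("x",))
```

Here `result` coincides with `B`: every tuple of `B` has at least one extension in `R` because `B` is contained in the projection `A`, and the join discards all rows of `R` whose `x` value is not in `B`.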


Proof of Theorem 3.10 

We begin by defining an intermediate language, $\mathcal{L}_s$, which is less expressive than 
$\mathcal{L}^{\wedge,\vee}$ but more expressive than $\mathcal{L}^{\wedge}$, and will have the Closure property. We denote 
by $\mathcal{L}_{\le}$ the language that allows only the predicate $\le$ and conjunction, and by $\mathcal{L}_{\ne}^{\wedge,\vee}$ 
the language that allows both disjunction and conjunction but only the predicate $\ne$. 
Note that all the predicates can be expressed by conjunctions of $\le$ and $\ne$. 
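
As a brief check of the last remark, and assuming that the interpreted predicates in question are the usual equality and order predicates, each of them can be written as a conjunction of $\le$ and $\ne$ atoms:

```latex
\begin{align*}
x = y   &\iff (x \le y) \wedge (y \le x) \\
x < y   &\iff (x \le y) \wedge (x \ne y) \\
x \ge y &\iff (y \le x) \\
x > y   &\iff (y \le x) \wedge (x \ne y)
\end{align*}
```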

Definition A.1: The language $\mathcal{L}_s$ contains all the sentences of the form $\phi \wedge \psi$, where 
$\phi \in \mathcal{L}_{\le}$ and $\psi \in \mathcal{L}_{\ne}^{\wedge,\vee}$. □ 

Lemma A.2: The language $\mathcal{L}_s$ has the Closure property. 

Proof: The join of two relations represented by the two sentences $\phi_1 \wedge \psi_1$ and $\phi_2 \wedge \psi_2$ 
is simply the sentence $(\phi_1 \wedge \phi_2) \wedge (\psi_1 \wedge \psi_2)$, which is in $\mathcal{L}_s$. Selection can be expressed 
by simply adding conjuncts of the form $X_i = X_j$ or $X_i = c$. To show that $\mathcal{L}_s$ is closed 
under projection, let $c = \phi \wedge \psi$ be a sentence in $\mathcal{L}_s$. We can assume that $\psi$ is in 
disjunctive normal form. Let $X$ be a subset of the variables in $c$. 

The language $\mathcal{L}_{\le}$ has the closure property (it simply represents a transitive 
relation on its variables). Therefore we have a sentence $\phi_X$ describing the exact 
projection of $\phi$ on $X$. 

Given a tuple $a$ that satisfies $\phi_X$, it will be in the projection of $c$ on $X$ if it can 
be extended to the variables in $c$ in such a way that $\psi$ is satisfied. However, we note 
that a contradiction between an extension of $a$ and $\psi$ can only arise from the fact 
that the extension satisfies too many equalities. 

Therefore, in order to construct a sentence that is equivalent to $c|_X$, all we need to 
do is check all the possibilities for equalities between the variables of $c$, and exclude 
the ones that contradict $\psi$. Specifically, let $k$ be a partition of the variables $X$ and the 
constants appearing in $c$, and let $c_{=}(k)$ be the conjunction of all the atoms $X = Y$, 
where $X$ and $Y$ are in the same partition. Let $k_1, \ldots, k_n$ be the partitions for which 
$c \wedge c_{=}(k)$ is unsatisfiable. We define $c_X$ by 

$$c_X = \phi_X \wedge \neg c_{=}(k_1) \wedge \cdots \wedge \neg c_{=}(k_n)$$

It is easy to see that any tuple not satisfying this formula will not be a member 
of $c|_X$ (since it either violates $\phi_X$, or belongs to one of the unsatisfiable partitions). 
Any tuple that does satisfy this formula satisfies $\phi_X$, and furthermore the equalities 
that it satisfies are consistent with $c$. □ 

Suppose $d$ is a symbolic derivation tree and we are computing the labeling $L_{\mathit{final}}$. 
Recall that in the rules we can only have conjunctive constraints (i.e., the conjunction 
of the literals of interpreted predicates). These constraints are therefore expressible 
in $\mathcal{L}_s$. Furthermore, labels in $L_{\mathit{final}}$ are computed by join and projection operations on 
these constraints, and therefore, since $\mathcal{L}_s$ has the Closure property, all the labels in 
$L_{\mathit{final}}$ can be expressed in $\mathcal{L}_s$. This observation enables us to characterize the difference 
between the labels created by $\mathcal{L}^{\wedge}$ and $\mathcal{L}^{\wedge,\vee}$: 

Lemma A.3: Let $d$ be a symbolic derivation tree and let $n \in d$. Let $c_b(n) = \phi \wedge \psi$ 
be the bottom-up label of $n$, where $\phi \in \mathcal{L}_{\le}$ and $\psi \in \mathcal{L}_{\ne}^{\wedge,\vee}$; then $c_b^{\wedge}(n) = \phi \wedge \psi_1$, where 
$\psi \models \psi_1$. The same relationship holds between $c_f(n)$ and $c_f^{\wedge}(n)$. 

Proof: The key observation underlying the proof is the following. Suppose $c = \phi \wedge \psi$ 
is a sentence in $\mathcal{L}_s$, and $X$ is a subset of its variables. Let $c|_X = \phi_1 \wedge \psi_1$ be the 
projection of $c$ on $X$. Suppose that $c'$ is the sentence in $\mathcal{L}^{\wedge}$ that is closest to (still 
weaker than) $c|_X$ (note that "closest" is well defined and unique). If $c'$ is written 
in $\mathcal{L}_s$ as $\phi_2 \wedge \psi_2$, then $\phi_1 = \phi_2$ and $\psi_1 \models \psi_2$. 

Therefore, by projecting a sentence into $\mathcal{L}^{\wedge}$ we only make the $\psi$ part weaker. 
Based on this observation, we can prove the lemma by a bottom-up induction on the 
nodes of $d$ followed by a top-down induction. 

For the leaf nodes, the claim is trivially true, since $c_b(n) = c_b^{\wedge}(n)$. Consider a 
rule-node $r$, and suppose the claim holds for all subgoals of $r$, $g_1, \ldots, g_m$. The label 
$c_b^{\wedge}(r)$ is computed by conjoining $c(r)$ and $c_b^{\wedge}(g_1), \ldots, c_b^{\wedge}(g_m)$. Since the conjunction 
can be done separately for the two components of the labels and since the labels of 
$g_1, \ldots, g_m$ satisfy the inductive assumption, the claim will also hold for $c_b(r)$. To 
compute the label of the father $g$ of $r$, we project $c_b^{\wedge}(r)$ on the variables of $g$. Based 
on the observation above, and since the claim holds for $r$, the resulting label of $g$ can 
only be weaker than $c_b(g)$ in the second ($\psi$) component. The proof is completed by 
a top-down induction on the nodes of $d$ in a similar fashion. □ 

P.mof of Theorem 3.10: Part l of the theorem follows directly from Lemma A. 3. 
Part 2 can be shown as follows. Suppose that cj(n) = <? A 4' and .that c'j(n) = <£lA C’i 
such that j= vi- We can assume that is in disjunctive normal form and at least 
one of its disjuncts (assume it is the first) is satisfiable in conjunction with <$>. Suppose 
its first disjunct is >pi A . . . A -’p*. where each of the p,’s is an equality. Since d>Aui 
is satisfiable, that means that o ft P\ for any 1 < i < l. .Therefore, -'p, will be a 
conjunct in Max^c^n)) (note that A/a^c^n)) can be computed using <$> alone). 
Consequently Max^cjin)) j=. and therefore Max^ic'jin)) \= <p A rj>. 

Finally, the third part of Theorem 3.10 is proved as follows. Starting from the
root of $d$, we show that if all the labels $c'_l(n)$ are satisfiable, then we can construct
a mapping $\theta$ of the variables of $d$ to constants that satisfies all the constraints in the
rules. Therefore, if we have such a variable mapping, it must be the case that $c_d$ is
satisfiable and therefore $c_l(n)$ is satisfiable for every $n \in d$.

We begin by assigning values to the variables that appear in the root of $d$, in a
way that is consistent with $c'_l(root(d))$. We assign distinct values to variables $X_i$ and
$X_j$ unless $c'_l(root(d))$ implies that $X_i = X_j$ (and unless $X_i = a$ is implied, we assign
to $X_i$ a value other than $a$).

We construct the variable mapping $\theta$ in a top-down order on the rule-nodes in $d$.
Let $r$ be a rule-node with father $g$ and interpreted subgoals $c_r$. We assume that the
variables of $g$ have already been assigned values that are consistent with $c'_l(g)$ and
contain only equalities that are implied by $c'_l(g)$. We extend $\theta$ to the variables that
appear in subgoals of $r$ but do not appear in $g$. In doing so, we choose values that
are consistent with $c = c'_l(g) \wedge c_r$, but only contain equalities that are implied by $c$.
To complete the proof, we need to show:

1. The constraint $c$ is satisfiable.

2. The mapping created thus far, for the variables of $g$, satisfies the projection of $c$
   on the variables of $g$.

The first condition guarantees that $c$ can be satisfied, and the second guarantees
that it can be satisfied by extending the mapping created thus far (for the variables of
$g$). To prove 1, recall that $c'_l(r)$ is satisfiable. However, $c'_l(r)$ was computed by first
computing $c_1 = c'_l(g) \wedge c'_b(r)$, and then finding the strongest constraint in $\mathcal{L}_{\neq}$ that is
weaker than $c_1$. Therefore $c_1$ must have also been satisfiable. Moreover, $c'_b(r) \models c_r$,
and therefore $c$ is satisfiable.

To prove 2, suppose $c'_l(g) = \phi_1 \wedge \psi_1$ when written in $\mathcal{L}_{\neq}$, and suppose $c = \phi_2 \wedge \psi_2$.
Recall that $\phi_2|_g = \phi_1$ because the language $\mathcal{L}_{\neq}$ has the closure property. Moreover,
$\theta$ satisfies $c'_l(g)$ and has only equalities that are implied by $c'_l(g)$. Therefore, $\theta$
will satisfy $\psi_2$ and will therefore satisfy $c|_g$. ∎


Proof of Theorem 3.26 

We begin by considering the case of $SI(r, q, \Sigma_P, DI_2, \mathcal{D}_q)$ when the rules have the
predicate $\neq$. The theorem is proved by reducing the acceptance problem of a linear-space
alternating Turing machine (ATM) [Chandra et al., 1981] to the problem of
finding irrelevance of rules. The execution of an ATM is described by a sequence of
instantaneous descriptions (id's), each describing the state of the machine at consecutive
stages of the execution, i.e., the contents of the input tape, the location of the
head and the state of the machine. An ATM is similar to a Turing machine, except
that its transition function gives a pair of moves for each combination of state and
symbol. Furthermore, every state is either an and-state or an or-state. If $q$ is an
and-state, then an id having state $q$ leads to acceptance of the input if both its successors
lead to acceptance. If $q$ is an or-state, then an id having state $q$ leads to acceptance if
either one of its successors leads to acceptance. The states of the machine alternate
in the sense that the successors of an and-state are or-states, and the successors of
an or-state are and-states.
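
The acceptance condition just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: ids are stood in for by arbitrary hashable values, and the names `leads_to_acceptance`, `MOVES`, `AND_STATES` and `run` are hypothetical helpers introduced here, not part of the reduction. The toy machine does not enforce strict alternation; it only exercises the and/or rule.

```python
from typing import Callable, FrozenSet, Hashable, Tuple

Config = Hashable  # stands in for an instantaneous description (id)


def leads_to_acceptance(
    config: Config,
    is_accepting: Callable[[Config], bool],
    is_and_state: Callable[[Config], bool],
    successors: Callable[[Config], Tuple[Config, Config]],
    ancestors: FrozenSet[Config] = frozenset(),
) -> bool:
    """Return True if `config` leads to acceptance under and/or semantics."""
    if is_accepting(config):
        return True
    if config in ancestors:
        # An id identical to one of its ancestors cannot help a finite accepting run.
        return False
    left, right = successors(config)  # the transition function gives a pair of moves
    path = ancestors | {config}
    ok_left = leads_to_acceptance(left, is_accepting, is_and_state, successors, path)
    ok_right = leads_to_acceptance(right, is_accepting, is_and_state, successors, path)
    # and-state: both successors must accept; or-state: either one suffices.
    return (ok_left and ok_right) if is_and_state(config) else (ok_left or ok_right)


# Hypothetical toy machine: ids are plain strings, each with two successors.
MOVES = {
    "start": ("a", "b"),
    "a": ("accept", "loop"),
    "b": ("accept", "accept"),
    "loop": ("loop", "loop"),
}
AND_STATES = {"start"}  # every other id is treated as an or-state


def run(start, and_states=AND_STATES):
    return leads_to_acceptance(
        start,
        is_accepting=lambda c: c == "accept",
        is_and_state=lambda c: c in and_states,
        successors=lambda c: MOVES[c],
    )
```

With `"a"` treated as an or-state the run from `"start"` accepts; declaring `"a"` an and-state makes it reject, because its second successor never reaches acceptance.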

Instantaneous descriptions are represented by a symbol for every cell on the tape. 
The symbol can either be an input symbol, or a composite symbol including an 
input symbol and a state of the machine. In a legal id, all cells but one contain the
input symbol that is on the tape in that state, and the cell on which the head is
placed contains a composite symbol containing the input symbol in that cell and the
internal state of the machine. The union of the input symbols and composite symbols
is denoted by $B$.

The reduction is based on representing id's as tuples of a predicate $id$, whose arity
is linear in the size of the input tape, $n$. Each cell in the id is represented by a
block of variables of size $|B|$. The variable $X$ appears in the block in the position
corresponding to the symbol appearing in the cell (we assume some arbitrary ordering
on the elements of $B$). All other columns contain the variable $W$. Thus the arity
of the predicate $id$ is $|B|n$. The tuples $\bar{X}_i$ are used to denote blocks of variables
corresponding to one cell. The tuple $\bar{X}_{\sigma}$ denotes a block of variables representing a
cell with the symbol $\sigma$ (i.e., $X$ appears in the position of $\sigma$ in $B$ and all other positions
are $W$).
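
As a concrete illustration of this block encoding, the sketch below builds the argument tuple of $id$ for a small tape. The function names `encode_cell` and `encode_id`, the string spelling of composite symbols (such as `"0/q"`), and the sample alphabet are assumptions made for the example only.

```python
from typing import Sequence, Tuple


def encode_cell(symbol: str, alphabet: Sequence[str]) -> Tuple[str, ...]:
    """One block of |B| variables: X at the position of `symbol` in B, W elsewhere."""
    if symbol not in alphabet:
        raise ValueError(f"{symbol!r} is not in the alphabet B")
    return tuple("X" if b == symbol else "W" for b in alphabet)


def encode_id(cells: Sequence[str], alphabet: Sequence[str]) -> Tuple[str, ...]:
    """Concatenate one block per tape cell, giving |B| * n arguments in total."""
    return tuple(var for symbol in cells for var in encode_cell(symbol, alphabet))


# Hypothetical alphabet B: two input symbols and two composite (symbol/state) symbols.
B = ("0", "1", "0/q", "1/q")
# A legal id on a tape of n = 3 cells, with the head on the middle cell.
tape = ("1", "0/q", "0")
encoded = encode_id(tape, B)
```

Each block contains exactly one occurrence of `X`, which is what later forces two different id tuples to unify `X` with `W` if they are matched against each other.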

Intuitively, we construct the program such that $id(\bar{X})$ is derivable if and only if
$\bar{X}$ describes a legal id and leads to acceptance. Given an ATM $M$ and its input,
we construct a program as follows. First we need rules representing transitions
between consecutive states. Suppose $\delta(c, q) = \{(q_1, s_1, R), (q_2, s_2, L)\}$² is a transition
of $M$. If $q$ is an or-state, then for every $i$ ($1 \le i \le n$)³ and every input symbol $b$, the
program contains the following rules:⁴

    $id(\bar{X}_1, \ldots, \bar{X}_{i-1}, \bar{X}_{s_1}, \bar{X}_{(q_1,b)}, \bar{X}_{i+2}, \ldots, \bar{X}_n) \Rightarrow id(\bar{X}_1, \ldots, \bar{X}_{i-1}, \bar{X}_{(c,q)}, \bar{X}_b, \bar{X}_{i+2}, \ldots, \bar{X}_n)$

    $id(\bar{X}_1, \ldots, \bar{X}_{i-2}, \bar{X}_{(q_2,b)}, \bar{X}_{s_2}, \bar{X}_{i+1}, \ldots, \bar{X}_n) \Rightarrow id(\bar{X}_1, \ldots, \bar{X}_{i-2}, \bar{X}_b, \bar{X}_{(c,q)}, \bar{X}_{i+1}, \ldots, \bar{X}_n)$


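If $q$ is an and-state, then for every $i$ ($1 < i < n$) and every pair of input symbols $b_1, b_2$,
the program contains the following rule:

    $id(\bar{X}_1, \ldots, \bar{X}_{i-2}, \bar{X}_{b_1}, \bar{X}_{s_1}, \bar{X}_{(q_1,b_2)}, \bar{X}_{i+2}, \ldots, \bar{X}_n) \;\wedge$
    $id(\bar{X}_1, \ldots, \bar{X}_{i-2}, \bar{X}_{(q_2,b_1)}, \bar{X}_{s_2}, \bar{X}_{b_2}, \bar{X}_{i+2}, \ldots, \bar{X}_n) \;\Rightarrow$
    $id(\bar{X}_1, \ldots, \bar{X}_{i-2}, \bar{X}_{b_1}, \bar{X}_{(c,q)}, \bar{X}_{b_2}, \bar{X}_{i+2}, \ldots, \bar{X}_n)$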

To complete the program, a few more rules are necessary. Denote by $\bar{X}_{final}$ the
tuple representing $M$'s (unique) accepting state, and by $\bar{X}_{init}$ the tuple representing
the initial state. R1 places the initial state as a subgoal of the query:

    R1: $id(\bar{X}_{init}) \Rightarrow p(X, W)$

R2 and R3 will lead from the accepting state to an EDB node. Note that $e$ is the only
EDB predicate.

    R2: $a(X, W) \Rightarrow id(\bar{X}_{final})$

    R3: $(X \neq W) \wedge e(X, W) \Rightarrow a(X, W)$

We denote the set of rules by $\mathcal{P}$. The following theorem establishes the correctness
of the reduction.

2 The L and R are arbitrary. 

3 If i = n, the head cannot move to the right; if i = 1, it cannot move to the left.

4 The exact form of the rule depends on the direction of the head movement. These rules are
shown to reflect the transition shown.

Theorem A.4: $SI(R1, p, \Sigma_P, DI_2, \mathcal{D}_p)$ holds if and only if $M$ does not accept its
input with the initial state $\bar{X}_{init}$.

Proof: We first note that any derivation of $p(X, W)$ will contain only two constants.
This follows from the fact that in all rules, all variables appearing in the body of
the rule appear in the head too. Therefore, the only constants in the derivation
will be those assigned to $X$ and $W$. Furthermore, since a derivation must include
the rule R3, these constants must be distinct. Therefore, we will refer to them as
$X$ and $W$ hereafter. Moreover, if $X$ and $W$ are distinct, an $id$ subgoal can only
be unified with itself in the head of a rule. This can be seen by considering each
block in the id. If they differ in the position of the $X$ variable (or if one of them
does not contain exactly one occurrence of $X$), it will force $X$ and $W$ to unify⁵.
Therefore, because of the way the transition rules are written, the subgoals of any $id$
node are the instantaneous descriptions of its successor states. Consequently, if the
top $id$ goal-node in a derivation is the node describing the initial state, then every
partial derivation of $p$ describes a possible execution tree of the ATM $M$. Therefore,
because the only way to get to an EDB subgoal is through rules R2 and R3, every
derivation of $p$ must describe an accepting execution trace of $M$. Therefore, if $p$ has
some derivation, then there is an execution of $M$ that will accept $\bar{X}_{init}$.

Conversely, suppose $M$ accepts its input. A simple trace of the machine's execution
will produce a symbolic derivation of $p(X, W)$ which must contain R1. ∎

To show the claim for $SI(r, q, \Sigma_P, DI_2, \mathcal{MD}_q)$, we modify $\mathcal{P}$ as follows. We replace
the rule R3 by

    R3': $e(X, W) \Rightarrow a(X, W)$

and we add the rule

    R4: $a(W, X) \Rightarrow a(X, W)$.

The reduction follows from the following theorem:

Theorem A.5: $SI(R4, p, \Sigma_P, DI_2, \mathcal{MD}_p)$ holds if and only if $M$ does not accept its
input with the initial state $\bar{X}_{init}$.

Proof: The proof is similar to that of Theorem A.4, with the following differences.
In any minimal derivation of the query that uses R4 the variables $X$ and $W$ must be
distinct. Moreover, every derivation of the query that does not use R4 can be modified
to use R4. We simply apply R4 to a subgoal of rule R2 and R3 to the subgoal of R4.
Therefore, if there is a minimal derivation of $p$ that includes R4, then $M$ will accept
its input.

5 This assumes each block is at least of size 3. If this doesn't hold, we simply add another dummy
column to each block, and leave it unchanged in all the rules.

Conversely, suppose $M$ accepts its input. We can assume the machine does not
enter a state with an id identical to one of its ancestors. A simple trace of the
machine's execution will produce a minimal symbolic derivation of $p(X, W)$, and like
before, it can be made to contain R4. ∎

Finally, it should be noted that the size of the program $\mathcal{P}$ is linear in $n$, the size
of the input tape of $M$; therefore the construction takes time linear in the size of the
input.

A.2 Proofs of Chapter 4

Proof of Theorem 4.6

In proving the theorem we will use the following lemma:

Lemma A.6: Let $P$ and $Q$ be two datalog programs such that $P \supseteq Q$, i.e., for any
given database $D$, the answers obtained for the query predicate of $P$ are a superset of
those obtained for the query predicate of $Q$. The problem of determining whether $P$
and $Q$ are equivalent (i.e., produce the same answer for any database $D$) is undecidable.

Proof: In [Shmueli, 1987] it is shown that determining whether two arbitrary datalog
programs are equivalent is undecidable. Suppose there is an algorithm $A$ to determine
equivalence of two programs $P$ and $Q$ when it is known that $P \supseteq Q$. Let $S$ and $R$
be two arbitrary datalog programs, and assume (without loss of generality) that they
do not share any IDB predicates. Let $G$ be the program consisting of the rules of $S$
and $R$ and the following rules:

    $s(X) \Rightarrow g(X)$

    $r(X) \Rightarrow g(X)$

where $g$ is the query predicate of $G$. The result of $G$ is the union of the results of
$S$ and $R$, and therefore $G \supseteq S$ and $G \supseteq R$. Clearly, the programs $S$ and $R$ are
equivalent if and only if

• $G$ and $S$ are equivalent and

• $G$ and $R$ are equivalent.

Each of these can be determined by the algorithm $A$. Therefore, the existence of
$A$ leads to a contradiction. ∎
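
The union construction used in this proof can be illustrated on a single fixed database. The sketch below is only that: it compares answers on one database at a time, whereas the lemma concerns equivalence over all databases, which is exactly what is undecidable. Programs are modelled as Python functions from a database to a set of answer tuples; `union_program`, `agree_via_union`, `prog_s` and `prog_r` are hypothetical names introduced for the example.

```python
from typing import Callable, Dict, Set, Tuple

Database = Dict[str, Set[Tuple]]
Program = Callable[[Database], Set[Tuple]]


def union_program(s: Program, r: Program) -> Program:
    """G: the rules of S and R plus s(X) => g(X) and r(X) => g(X)."""
    return lambda db: s(db) | r(db)


def agree_via_union(s: Program, r: Program, db: Database) -> bool:
    """On this one database: G agrees with S and G agrees with R."""
    g = union_program(s, r)
    return g(db) == s(db) and g(db) == r(db)


# Two hypothetical programs over an EDB relation `edge`.
def prog_s(db: Database) -> Set[Tuple]:
    return {(x,) for (x, y) in db["edge"]}


def prog_r(db: Database) -> Set[Tuple]:
    return {(x,) for (x, y) in db["edge"] if x != y}


db_same = {"edge": {(1, 2), (2, 3)}}
db_diff = {"edge": {(1, 1), (2, 3)}}
```

On `db_same` both programs return the same answers and the two comparisons against G succeed; on `db_diff` they differ, and at least one comparison against G fails. In both cases G's answers contain those of S and of R, which is the containment the hypothetical algorithm $A$ would require.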
Based on this lemma we can prove Theorem 4.6 as follows.

Proof: We will show that if determining $SI(\phi, q, \Sigma', DI_2, \mathcal{D}_q)$ is decidable, this will
contradict Lemma A.6. Let $P$ and $Q$ be two datalog programs with query predicates
$p$ and $q$ respectively and such that $P \supseteq Q$, and assume without loss of generality that
$P$ and $Q$ have no IDB predicates in common. Consider the following program $G$ that
includes the rules of $P$ and $Q$ and the following rules for its query predicate $g$:

    $r_1$: $p(X) \wedge e(X) \Rightarrow g(X)$.
    $r_2$: $q(X) \wedge e(X) \Rightarrow g(X)$.

where $e$ is a new EDB predicate that appears nowhere in $P$ or $Q$ and has the same
arity as $p$ and $q$.

To prove the theorem, we establish the following claim. Let $I$ be the irrelevance
claim that states that $r_2$ is strongly irrelevant to $g$. Then $I \Rightarrow SI(r_1, g, \Sigma', DI_2, \mathcal{D}_g)$
if and only if $P$ and $Q$ are equivalent.

If $P$ and $Q$ are equivalent, then the join of $q$ and $e$ is empty exactly when the
join of $p$ and $e$ is empty. Therefore, if $D$ is a database in which $r_2$ is not used in any
derivation of $g$, then $r_1$ will not be used in any derivation of $g$ either.

Suppose $P$ and $Q$ are not equivalent, i.e., $P \supseteq Q$ and $P \neq Q$. Then, there is
some database $D$ in which the difference between $p$ and $q$ (denoted by $p - q$) is not
empty. Consider the database $D'$ that consists of $D$ and the facts $e(X)$ for $X$ such
that $X \in p - q$. The database $D'$ is such that $r_2$ will not be used in any derivation of
$g$; however, $r_1$ will be used. Therefore, $I \Rightarrow SI(r_1, g, \Sigma', DI_2, \mathcal{D}_g)$ cannot hold. ∎
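
The separating database $D'$ can be made concrete with sets. The sketch below assumes the answers for $p$ and $q$ on some database $D$ are already given as sets of tuples (with $q$'s answers contained in $p$'s), and that adding facts for the new predicate $e$ does not change them, since $e$ appears nowhere in $P$ or $Q$. The function names are hypothetical.

```python
from typing import Set, Tuple

Answers = Set[Tuple]


def separating_e_facts(p_answers: Answers, q_answers: Answers) -> Answers:
    """Facts e(X) added to D to form D': exactly the tuples in p - q."""
    return p_answers - q_answers


def rule_instances(p_answers: Answers, q_answers: Answers, e_facts: Answers):
    """Tuples for which r1 (p join e) and r2 (q join e) can fire."""
    fired_r1 = p_answers & e_facts
    fired_r2 = q_answers & e_facts
    return fired_r1, fired_r2


# Hypothetical answers on some database D, with q's answers contained in p's.
p_ans = {(1,), (2,), (3,)}
q_ans = {(1,), (2,)}
e_new = separating_e_facts(p_ans, q_ans)
r1_used, r2_used = rule_instances(p_ans, q_ans, e_new)
```

Here `r2_used` is empty while `r1_used` is not, matching the argument that on $D'$ the rule $r_2$ is never used although $r_1$ is. When the two answer sets coincide, no facts are added and neither rule fires.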


A.3 Proofs of Chapter 6

In our proofs we use the following lemma that is proven by Plaisted [Plaisted, 1981]:

Lemma A.7: Let $f$ be a mapping on literals, which is extended in a straightforward
fashion to a mapping on clauses. Suppose $f$ satisfies the following properties:

1. $f(\neg L) = \neg f(L)$ for any literal $L$.

2. If $C$ and $D$ are clauses and $D$ is an instance of $C$, then $f(D)$ is an instance of
   $f(C)$.

Let $C_1$ and $C_2$ be two clauses and $C_3$ be one of their resolvents. Then the clause
$f(C_3)$ is subsumed by some resolvent of $f(C_1)$ and $f(C_2)$.

This lemma implies the following proposition:

Proposition A.8:

1. Let $C_1$ and $C_2$ be clauses that are independent of the predicate arguments $\mathcal{R}$,
   and suppose $C_3$ is a resolvent of $C_1$ and $C_2$. Then $f_{\mathcal{R}}(C_3)$ is subsumed by some
   resolvent of $f_{\mathcal{R}}(C_1)$ and $f_{\mathcal{R}}(C_2)$.

2. Let $C_1$ and $C_2$ be clauses that are independent of the predicate refinement $\mathcal{Q}$,
   and suppose $C_3$ is a resolvent of $C_1$ and $C_2$. Then $f_{\mathcal{Q}}(C_3)$ is subsumed by some
   resolvent of $f_{\mathcal{Q}}(C_1)$ and $f_{\mathcal{Q}}(C_2)$.

Proof: The proof follows from the observation that both mappings, $f_{\mathcal{R}}$ and $f_{\mathcal{Q}}$, satisfy
the conditions of Lemma A.7. ∎
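
To make the two conditions of Lemma A.7 tangible, the sketch below implements one argument-dropping mapping on literals and checks both conditions on a single example. It is an illustration only: the literal representation, the priming of the predicate name, and the helper names `negate`, `drop_arguments` and `substitute` are assumptions, and checking one instance is of course not a proof.

```python
from typing import Dict, FrozenSet, Tuple

# A literal is (is_positive, predicate, arguments); uppercase names are variables.
Literal = Tuple[bool, str, Tuple[str, ...]]


def negate(lit: Literal) -> Literal:
    positive, pred, args = lit
    return (not positive, pred, args)


def drop_arguments(lit: Literal, dropped: Dict[str, FrozenSet[int]]) -> Literal:
    """Map p(X1..Xk) to p'(...) by removing the argument positions listed for p."""
    positive, pred, args = lit
    positions = dropped.get(pred, frozenset())
    if not positions:
        return lit  # predicates without dropped positions are left untouched
    kept = tuple(a for i, a in enumerate(args) if i not in positions)
    return (positive, pred + "'", kept)


def substitute(lit: Literal, sigma: Dict[str, str]) -> Literal:
    """Apply a substitution to the arguments of a literal."""
    positive, pred, args = lit
    return (positive, pred, tuple(sigma.get(a, a) for a in args))


DROPPED = {"p": frozenset({1})}  # hypothetical: drop the second argument of p
lit = (True, "p", ("X", "Y", "Z"))
sigma = {"X": "a", "Y": "b"}
```

For this mapping, negation commutes with the mapping (condition 1), and mapping an instance gives the same result as instantiating the mapped literal (condition 2), since dropping positions does not interact with the substitution.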


Proof of Theorem 6.5 

Proof: The first half of the theorem follows from Proposition A.8. Let $D$ be the
derivation for which $DI(\mathcal{R}, D)$ holds. By the proposition, if $Base(D) \vdash C$ then there
is some clause $C'$ that subsumes $f_{\mathcal{R}}(C)$ such that $f_{\mathcal{R}}(Base(D)) \vdash C'$. Therefore,
$f_{\mathcal{R}}(Base(D)) \vdash f_{\mathcal{R}}(C)$.

For the converse, suppose $f_{\mathcal{R}}(\Delta) \models f_{\mathcal{R}}(q)$ and let $I$ be a model of $\Delta$, i.e., $I \models \Delta$.
We need to show that $I \models q$. By the definition of independence, $Abs(I) \models Abs_{\mathcal{R}}(\Delta)$
and therefore, $Abs(I) \models f_{\mathcal{R}}(q)$. However, since $q$ contains no irrelevant arguments, the
relations denoted by predicates occurring in $q$ are identical to the relations denoted
by predicates occurring in $f_{\mathcal{R}}(q)$, and therefore $I \models q$. ∎


Proof of Theorem 6.7 

Proof: Suppose that A1-A3 hold as required. Let $I$ be a model of $\Delta$ (and therefore,
$I \models C$), and let $\mu'$ be an arbitrary variable assignment to the variables of $f_{\mathcal{R}}(C)$. To
show that $Abs(I) \models f_{\mathcal{R}}(C)$ we need to show that $Abs(I) \models f_{\mathcal{R}}(C)\mu'$.

Note that if $\mu$ is an arbitrary extension of $\mu'$ to the variables of $C$, then since
$I \models C$ holds, $I \models C\mu$ holds. Therefore, $I$ satisfies at least one of the literals of
$C\mu$. There are three possible cases:

Case 1: There is some extension $\mu$ such that one of the satisfied literals $L$ is of a
predicate $p$ (either positive or negative) and $p$ does not appear in $\mathcal{R}$. In this case,
$f_{\mathcal{R}}(L) = L$ holds and therefore $Abs(I) \models f_{\mathcal{R}}(L)$ holds because the relation denoted
by $p$ in $I$ is the same as denoted in $Abs(I)$.

Case 2: There is some extension $\mu$ such that one of the satisfied literals is a positive
literal $p(\bar{X})$, where $p$ appears in $\mathcal{R}$, and $f_{\mathcal{R}}(p(\bar{X})) = p'(\bar{Y})$. If $I \models p(\bar{X})\mu$ then
$Abs(I) \models p'(\bar{Y})\mu'$ since $\bar{Y}\mu'$ is a projection of $\bar{X}\mu$ and the relation denoted by $p'$ is
the corresponding projection of the relation denoted by $p$.

Case 3: If neither of the first two cases happens, then it must be the case that for
every extension $\mu$ of $\mu'$ the literal in $C$ that is satisfied is of the form $\neg p(\bar{X})$ where
$p$ appears in $\mathcal{R}$. Again, we denote $f_{\mathcal{R}}(\neg p(\bar{X}))$ by $\neg p'(\bar{Y})$. In order to prove that
$C$ is independent, we need to show that there is a single literal $\neg p_i(\bar{X}_i) \in Neg(C)$
such that for any extension $\mu$ of $\mu'$, $I \models \neg p_i(\bar{X}_i)\mu$. If there exists such a literal, then
$Abs(I) \models \neg p'_i(\bar{Y}_i)\mu'$ holds because the projection of the relation denoted by $p_i$ on $\bar{Y}_i\mu'$
is empty. This follows from the fact that no constants appear in argument positions
of $\neg p_i(\bar{X}_i)$ that are projected.

To prove that there exists such a literal, assume the contrary. That is, we assume
that for every negative literal $\neg p_i(\bar{X}_i) \in C$ such that $p_i$ appears in $\mathcal{R}$
there is some extension $\mu$ of $\mu'$ such that $I \models p_i(\bar{X}_i)\mu$. To show the contradiction, we
will build an extension $\mu_0$ of $\mu'$ such that $I \not\models C\mu_0$.

If $(p, i) \in \mathcal{R}$ and $X$ is a variable such that $X \in AtPos(p, i)$, then $X$ does not
appear in $f_{\mathcal{R}}(C)$. This is because it has only one appearance in $Neg(C)$ (by A2)
and all its appearances in $Pos(C)$ have been projected out (by A3). Therefore, in
extending $\mu'$ to $\mu_0$ we are free to assign a value to $X$. If $\neg p_i(\bar{X}_i) \in Neg(C)$, then in
$\mu_0$ we assign to the variables in $\bar{X}_i - \bar{Y}_i$⁶ the values that make $p_i(\bar{X}_i)$ satisfiable in
$I$. Note that such an assignment exists because of our assumption. Furthermore, the
choice of assignments for the variables in $\bar{X}_i - \bar{Y}_i$ does not affect variables in the other
literals of $Neg(C)$ (because of A2), and therefore can be done independently for every
literal in $Neg(C)$. Variables in $Var(C) - Var(f_{\mathcal{R}}(C))$ that appear only in $Pos(C)$ are
assigned arbitrary values. Because of our assumptions, none of the literals in $Neg(C)$
are satisfied with the variable assignment $\mu_0$. Moreover, none of the positive literals
are satisfied because neither case 1 nor case 2 occurred. Therefore, $I \not\models C$ holds, which is
a contradiction. ∎
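
The semantic side of the argument, where $Abs(I)$ interprets $p'$ as a projection of the relation for $p$, can be illustrated as follows. This is a sketch under the assumption that relations are finite sets of tuples and that the kept argument positions are given explicitly; the function names are hypothetical.

```python
from typing import Sequence, Set, Tuple


def project_relation(relation: Set[Tuple], kept: Sequence[int]) -> Set[Tuple]:
    """Relation denoted by p' in Abs(I): the projection of p on the kept positions."""
    return {tuple(t[i] for i in kept) for t in relation}


def negated_holds_in_abstraction(relation, kept, projected) -> bool:
    """Abs(I) satisfies not p'(projected)."""
    return projected not in project_relation(relation, kept)


def negated_holds_for_every_extension(relation, kept, projected) -> bool:
    """I satisfies not p(t) for every tuple t that extends `projected`."""
    return all(tuple(t[i] for i in kept) != projected for t in relation)


# Hypothetical relation for a binary predicate p; only the first argument is kept.
p_rel = {(1, "a"), (2, "b")}
KEPT = (0,)
```

A tuple in the relation for $p$ always yields its projection in the relation for $p'$ (Case 2), and a negative literal over $p'$ holds in the abstraction exactly when the corresponding negative literal over $p$ holds for every extension of the assignment (the situation required in Case 3).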


Proof of Theorem 6.9

Proof: The first half of the theorem follows from Proposition A.8. Let $D$ be the
derivation for which $DI(\mathcal{Q}, D)$ holds. By the proposition, if $Base(D) \vdash \phi$ then there
is some clause $\phi'$ that subsumes $f_{\mathcal{Q}}(\phi)$ such that $f_{\mathcal{Q}}(Base(D)) \vdash \phi'$. Therefore,
$f_{\mathcal{Q}}(Base(D)) \vdash f_{\mathcal{Q}}(\phi)$.

For the converse, suppose $f_{\mathcal{Q}}(\Delta) \models f_{\mathcal{Q}}(q)$ and let $I$ be a model of $\Delta$, i.e., $I \models \Delta$.
We need to show that $I \models q$. By the definition of independence, $Abs(I) \models Abs_{\mathcal{Q}}(\Delta)$
holds and therefore, $Abs(I) \models f_{\mathcal{Q}}(q)$. However, since $q$ does not contain predicates
from $\mathcal{Q}$, the relations denoted by predicates occurring in $q$ are identical to the relations
denoted by predicates occurring in $f_{\mathcal{Q}}(q)$, and therefore $I \models q$. ∎

6 $\bar{X}_i - \bar{Y}_i$ denotes the variables that appear in $\bar{X}_i$ and not in $\bar{Y}_i$.


Proof of Lemma 6.10 

Proof: Suppose the clause $C$ is independent of the predicate refinement $\mathcal{Q}$. Let
$Neg(C)'$ be the result of replacing occurrences of predicates in $\mathcal{Q}$ by arbitrary other
predicates in $\mathcal{Q}$. We define $C_1$ as follows. $C_1$ includes the literals in $Pos(C)$ as well
as the following literals. If $p(\bar{X}) \in Pos(C)$ and $p \in \mathcal{Q}$, then $C_1$ includes the literals
$q_1(\bar{X}), \ldots, q_n(\bar{X})$. We denote by $C'$ the clause containing the union of literals in
$Neg(C)'$ and $C_1$. Note that $f_{\mathcal{Q}}(C_1) = f_{\mathcal{Q}}(Pos(C))$. We show that $\Delta \models C'$.

Let $I$ be a model of $\Delta$ (and therefore of $C$) and let $\mu$ be an arbitrary assignment
to the variables in $C$ (which are the same as the variables in $C'$). We need to show
that $I \models C'\mu$.

Clearly, $I \models C\mu$ and so some literal in $C\mu$ is satisfied by $I$. If the satisfied literal
is either positive or involves a predicate that does not appear in $\mathcal{Q}$, then the same
literal will appear in $C'$ and therefore $I \models C'\mu$.

Otherwise, the satisfied literal is of the form $\neg q_i(\bar{X})$, where $q_i \in \mathcal{Q}$. In $C'$ the literal
$\neg q_i(\bar{X})$ is mapped to $\neg q_j(\bar{X})$. Recall that by our assumption, $Abs(I) \models f_{\mathcal{Q}}(C)$, and
therefore $Abs(I) \models f_{\mathcal{Q}}(C)\mu$. Let $L = r(\bar{Y})$ be a literal (either positive or negative)
satisfied in $f_{\mathcal{Q}}(C)\mu$. There are three cases:

Case 1: There is a satisfied literal such that $r \notin \mathcal{Q}$. In this case, the literal $r(\bar{Y})$
appears in $C$ and $C'$ and therefore $I \models C'\mu$.

Case 2: There is a satisfied positive literal $L$ of the form $q(\bar{X})$ where $q$ is the new
predicate. This means that for some $q_i \in \mathcal{Q}$, $\bar{X}\mu \in Q_i$, where $Q_i$ is the relation
denoted by $q_i$ in $I$. The literal $q_i(\bar{X})$ is also in $C_1$ and therefore $I \models C'\mu$.

Case 3: The satisfied literal $L$ is negative and of the form $\neg q(\bar{X})$. This means that
for all predicates $q_i \in \mathcal{Q}$, $\bar{X}\mu \notin Q_i$. In particular, it is true for $q_j$, and therefore
the literal $\neg q_j(\bar{X})\mu$ is satisfied, and $I \models C'\mu$.

For the other direction, let $C$ be a clause and assume the condition of the lemma
holds. Let $I$ be a model of $\Delta$. We need to show that $Abs(I) \models f_{\mathcal{Q}}(C)$.

Let $\mu$ be an arbitrary assignment to the variables of $C$. We need to show that
$Abs(I) \models f_{\mathcal{Q}}(C)\mu$. Clearly, $I \models C\mu$. There are three cases:

Case 1: If one of the satisfied literals involves a predicate that is not in $\mathcal{Q}$, then that
literal will also appear identically in $f_{\mathcal{Q}}(C)$ and the relation denoted by the predicate
of the literal will be the same in $I$ and $Abs(I)$. Therefore, $Abs(I) \models f_{\mathcal{Q}}(C)\mu$.

Case 2: If one of the satisfied literals is a positive literal of the form $q_i(\bar{X})$, where
$q_i \in \mathcal{Q}$, then the corresponding literal in $f_{\mathcal{Q}}(C)$ will be $q(\bar{X})$. However, since the relation
denoted by $q$ in $Abs(I)$ includes the relation denoted by $q_i$ in $I$, $Abs(I)$ will
satisfy $q(\bar{X})\mu$. Moreover, if there is any literal of the form $q_i(\bar{X})$, where $q_i \in \mathcal{Q}$, such
that $I \models q_i(\bar{X})\mu$, then $Abs(I) \models q(\bar{X})\mu$.

Case 3: If neither of the previous cases occurred, then the set of satisfied literals
must be of the form $\neg q_1(\bar{X}_1), \ldots, \neg q_k(\bar{X}_k)$, where $q_i \in \mathcal{Q}$ for $1 \le i \le k$. We complete
the proof by showing that for at least one of these literals, $Abs(I) \models \neg q(\bar{X}_i)\mu$. Suppose
the contrary, i.e., for each of these literals $Abs(I) \models q(\bar{X}_i)\mu$. This means that for every $i$,
$1 \le i \le k$, there exists a predicate $q_{j(i)}$ such that $q_{j(i)} \in \mathcal{Q}$ and $\bar{X}_i\mu \in Q_{j(i)}$. Consider the
set of literals $Neg(C)'$ obtained by replacing $q_i(\bar{X}_i)$ by $q_{j(i)}(\bar{X}_i)$. By the assumption
of the lemma, there exists some set of literals $C_2$ such that $f_{\mathcal{Q}}(C_2) = f_{\mathcal{Q}}(Pos(C))$
and $\Delta \models Neg(C)' \cup C_2$, and therefore $I \models (Neg(C)' \cup C_2)\mu$. Clearly, each of the
satisfied literals in $(Neg(C)' \cup C_2)\mu$ must be a positive literal involving a predicate in
$\mathcal{Q}$. However, since $f_{\mathcal{Q}}(C_2) = f_{\mathcal{Q}}(Pos(C))$, this would contradict the fact that case 2
did not occur. ∎
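
The way $Abs(I)$ interprets the new predicate $q$ in this proof, as containing the relation of every $q_i$ in the refinement, can be sketched with sets. The example assumes the relation for $q$ is exactly the union of the $Q_i$, which is one reading consistent with Case 2 and Case 3 above; the interpretation, predicate names and helper name are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Interpretation = Dict[str, Set[Tuple]]


def abstract_relation(interp: Interpretation, refinement: Iterable[str]) -> Set[Tuple]:
    """Relation denoted by q in Abs(I): the union of the relations of the q_i in I."""
    result: Set[Tuple] = set()
    for q_i in refinement:
        result |= interp.get(q_i, set())
    return result


# Hypothetical interpretation I with a refinement Q = {q1, q2}.
I = {"q1": {(1,)}, "q2": {(2,)}}
Q = ("q1", "q2")
q_rel = abstract_relation(I, Q)
```

A positive literal over some $q_i$ that is satisfied in $I$ gives a satisfied literal over $q$ in the abstraction, while a negative literal over $q$ is satisfied only when the tuple is missing from every $Q_i$, which is why Case 3 has to consider replacing each $q_i$ by another predicate of the refinement.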